I have always wanted a NoSQL database that is purpose-built for storing large volumes of nested/threaded comments. I would probably implement it in Java, because that is the language I know best. I really like how simple it is to set up an Elasticsearch cluster and throw data into it, and I want my product to share those qualities. Here are the features I have in mind:
1) auto/manual sharding across clusters
2) auto/manual indexing across clusters
3) full-text search (probably via Lucene or Elasticsearch)
4) REST/JSON API
5) retrieve any comment by ID
6) comments can be retrieved with or without child nodes
7) comment trees can be retrieved with a specified depth
8) comment trees can be retrieved filtered by time or rank
9) entire comment trees can be re-parented.
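To make items 6, 7 and 9 concrete, here is a rough in-memory sketch of what I mean, using a plain adjacency list (each comment knows its parent and its children). The class name CommentTree and its methods add, subtree and reparent are made up for illustration; this is not taken from any existing library, and it ignores persistence, sharding, and ranking entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory comment tree stored as an adjacency list. */
final class CommentTree {
    // comment id -> parent id (null for a top-level comment)
    private final Map<Long, Long> parentOf = new HashMap<>();
    // comment id -> child ids, in insertion order
    private final Map<Long, List<Long>> childrenOf = new HashMap<>();

    /** Adds a comment; parentId is null for a top-level comment. */
    void add(long id, Long parentId) {
        if (parentOf.containsKey(id)) {
            throw new IllegalArgumentException("duplicate id " + id);
        }
        if (parentId != null && !parentOf.containsKey(parentId)) {
            throw new IllegalArgumentException("unknown parent " + parentId);
        }
        parentOf.put(id, parentId);
        childrenOf.put(id, new ArrayList<>());
        if (parentId != null) {
            childrenOf.get(parentId).add(id);
        }
    }

    /**
     * Returns rootId followed by its descendants in pre-order, going at most
     * maxDepth levels below rootId. maxDepth 0 returns the comment alone.
     */
    List<Long> subtree(long rootId, int maxDepth) {
        if (!parentOf.containsKey(rootId)) {
            throw new IllegalArgumentException("unknown id " + rootId);
        }
        List<Long> out = new ArrayList<>();
        collect(rootId, maxDepth, out);
        return out;
    }

    private void collect(long id, int depthLeft, List<Long> out) {
        out.add(id);
        if (depthLeft == 0) {
            return;
        }
        for (long child : childrenOf.get(id)) {
            collect(child, depthLeft - 1, out);
        }
    }

    /** Moves a comment and its whole subtree under newParentId. */
    void reparent(long id, long newParentId) {
        if (!parentOf.containsKey(id) || !parentOf.containsKey(newParentId)) {
            throw new IllegalArgumentException("unknown id");
        }
        // Walk up from the new parent; meeting id means the move would create a cycle.
        for (Long a = newParentId; a != null; a = parentOf.get(a)) {
            if (a == id) {
                throw new IllegalArgumentException("cannot move a comment under itself");
            }
        }
        Long oldParent = parentOf.get(id);
        if (oldParent != null) {
            childrenOf.get(oldParent).remove(Long.valueOf(id));
        }
        parentOf.put(id, newParentId);
        childrenOf.get(newParentId).add(id);
    }
}
```

With an adjacency list, re-parenting only touches the moved comment and its old and new parents, but fetching a subtree means walking it node by node. The recursion here would also need to become iterative for very deep threads. Whether that trade-off holds up once the data is sharded is exactly the kind of thing I want to study.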
What I'm looking for are exceptional pieces of code or specific algorithms that I can study before digging into this project. Can anyone suggest a few places to get started?