We tried to insert 160mil 2k -20k documents into a single Tokyo Tyrant server, and performance quickly dropped off and kept going down. You could have had a nice holiday skiing on the graph of inserts/sec. This is pretty typical of anything that you write to a disk. The more you write, the slower it goes. We could distribute the write easily, because Tokyo doesn’t scale.
Tokyo does support replication, and a few other great things, but these don’t make for scaling.
* We don’t use the full text search, so I can’t comment there.
Redis
url: http://code.google.com/p/redis/
type: Key/Value store with collections and counters
Conclusion: Doesn’t scale.
Redis is also awesome like Tokyo. I would say the two are pretty comparable as simple k/v stores. The counters and collections are AWESOME, and if I was still at AboutUs, I think I’d be pushing to move a couple pieces of the infrastructure to Redis. I have less to say about Redis because I haven’t used it in production, but it looks great if it fits your bill. Does it scale? No. It, just as memecached and tokyo tyrant, can do sharding by handling it in the client, and therefore, you can’t just start adding new servers and increase your throughput. Nor is it fault tolerant. Your redis server dies, and there goes that data. And just as tokyo tyrant and memcached, you probably won’t ever need to try to scale it. Redis also supports replication.
Project Voldemort
url: http://project-voldemort.com/
type: Distributed Key/Value store
Conclusion: Scales!
Voldemort is a very cool project that comes out of LinkedIn. They seem to even be providing a full time guy doing development and support via a mailing list. Kudos to them, because Voldemort, as far as I can tell, is great. Best of all, it scales. You can add servers to the cluster, you don’t do any client side hashing, throughput is increased as the size of the cluster increases. As far as I can tell, you can handle any increase in requests by adding servers as well as those servers being fault tolerant, so a dead server doesn’t bring down the cluster.
Voldemort does have a downside for me, because I primarily use ruby and the provided client is written in java, so you either have to use JRuby (which is awesome but not always realistic) or Facebook Thrift to interact with Voldemort. This means thrift has to be compiled on all of your machines, and since Thrift uses Boost C++ library, and Boost C++ library is both slow and painful to compile, deployment of Voldemort apps is increased significantly.
Voldemort is also intersting because it has pluggable data storage backend and the bulk of it is mostly for the sharding and fault tolerance and less about data storage. Voldemort might actually be a good layer on top of Redis or Tokyo Cabinet some day.
Voldemort, it should be noted, is also only going to be worth using if you actually need to spread your data out over a cluster of servers. If your data fits on a single server in Tokyo Tyrant, you are not going to gain anything by using Voldemort. Voldemort however, might be seen as a good migration path from Tokyo * when you do hit that wall were performance isn’t enough.
MongoDB
url: http://www.mongodb.org
type: Document Database
Conclusion: Doesn’t scale (yet!)
MongoDB is not a key/value store, it’s quite a bit more. It’s definitely not a RDBMS either. I haven’t used MongoDB in production, but I have used it a little building a test app and it is a very cool piece of kit. It seems to be very performant and either has, or will have soon, fault tolerance and auto-sharding (aka it will scale). I think Mongo might be the closest thing to a RDBMS replacement that I’ve seen so far. It won’t work for all data sets and access patterns, but it’s built for your typical CRUD stuff. Storing what is essentially a huge hash, and being able to select on any of those keys, is what most people use a relational database for. If your DB is 3NF and you don’t do any joins (you’re just selecting a bunch of tables and putting all the objects together, AKA what most people do in a web app), MongoDB would probably kick ass for you.
Oh, and did I mention that, of all the NoSQL options out there, MongoDB is the one of the only ones being developed as a business with commercial support available? If you’re dealing with lots of other people’s data, and have a business built on the data in your DB, this isn’t trivial.
On a side note, if you use Ruby, check out MongoMapper for very easy and nice to use ruby access.