09-08-10 banq


1.如果正在建立类似Lotuc notes应用,数据会暂时离线几个小时,然后再上线,这种模式适合Couch,和Couch的MVCC模式(Multiversion concurrency control)完全符合。







[该贴被admin于2009-08-10 17:55修改过]


2009-08-10 17:45
很著名的一篇NoSQL: If Only It Was That Easy



The biggest thing in web apps since “rails can’t scale” is this idea that “your rdbms doesn’t scale.”(在Web最大的事情就是Rails并不具有伸缩性,因为Rails是依赖数据库的,而数据库是不可伸缩性,这句话我个人好像以前也说过). This has gone so far as to be dubbed the coming of age for “nosql” with lots of blog posts and even a meetup.(未来被称为无数据库时代) Indeed, there are many promising key-value stores, distributed key-value stores, document oriented dbs, and column oriented db projects on the radar( 由此诞生了很多key-value存储模式,个人推荐的缓存也是一种key-value存储模式, key-value存储模式是可伸缩的). This is *definitely* a great thing for the web application scene and this level of variety will definitely open doors for organizations large and small in the near and long term.

However, along with these great tools, an attitude that “the rdbms is dead” has popped up, and while that may be true in the long run, in the short term, it’s definitely premature.(随着这些伟大的工具推出和应用,一种观点说, “关系型数据库管理系统已经死了”,而这也许是真实的,从长远来看,在短期内,这肯定为时过早。)


What is scaling?

First, lets get a couple things straight:

to scale (third-person singular simple present scales, present participle scaling, simple past and past participle scaled)

(transitive) To change the size of, maintaining proportion.

We should scale that up by a factor of 10.

(transitive) To climb.

Hilary and Norgay were the first known to have scaled Everest.

(intransitive) (computing) To tolerate significant increases in throughput or other potentially limiting factors.

That architecture won’t scale to real-world environments.

The first thing we need to agree on is what it means to scale. According to our definitions above, it’s to change the size while maintaining proportions and in CS this usually means to increase throughput. What scaling isn’t: performance.

performance (plural performances)The act of performing; carrying into execution or action; execution; achievement; accomplishment; representation by action; as, the performance of an undertaking of a duty.In Computer science: The amount of useful work accomplished by a computer system compared to the time and resources used. Better Performance means more work accomplished in shorter time and/or using less resources

[该贴被admin于2009-08-10 17:54修改过]

2009-08-10 17:47
In reality, scaling doesn’t have anything to do with being fast. It has only to do with size. If your request takes 12 seconds, it doesn’t matter, it only matters that you can do 1 per second, 10 per second, 100 per second, 1000 per second, etc. of that 12 second query.

Now, scaling and performance do relate in that typically, if something is performant, it may not actually need to scale. Your upper limit may be high enough you don’t even need to worry about scaling. The problem with rails is not that it doesn’t scale (it happens to scale pretty easily), it’s that you have to scale it almost immediately. The problem with RDBMS isn’t that they don’t scale, it’s that they are incredibly hard to scale. Ssharding is the most obvious way to scale things, and sharding multiple tables which can be accessed by any column pretty quickly gets insane. Furthermore, you might be able to use something other than a RDBMS that you won’t need to scale because it’s more performant or efficient at doing the work you’re currently doing in a RDBMS.

So NoSQL then. . .

At my previous job, my co-workers and I evaluated most of the current nosql solutionsto varying degrees. All of the projects have been evaluated for both use as simple “tables” of data, such as storing a single type of key/value data, as well as a document db to be used as our primary data store. They have a curious data set which consists of a single set of parent objects with many child objects that relate 1-1 or 1-n with this set of objects (I call these primary objects). We then have a secondary set of objects that store changes made to both the primary objects that we primarily use for auditing. Our current db setup is standard master-slave replication in mysql with 1 master and up to 3 slaves depending on usage. The primary objects are mostly changed via UPDATES and the secondary objects are all inserted at the end of the tables. We also have a few random other data sets which loosely relate to the primary parent objects.

To get a couple out of the way, I’m not going to cover memcached (because it’s not a db), memcachedb (general sentiment that it is immature), couchdb (because we didn’t want to use map/reduce to pull information and there are questions about it’s performance and replication), dynomite (seen as immature), Amazon SimpleDB (because of size limits), or Lightcloud (seen as immature). As far as the ones that we I deemed immature, I’m sure there are people out there using these things and having a great time, but our research into them, and word of mouth from others who have tried them kept us from really going deep.

Tokyo *


type: Key/Value store with Full Text Search*

Conclusion: Doesn’t scale.

We liked Tokyo Tyrant so much, we put it in production. In fact, every request to hits Tokyo. One of the uses is as a persistent memcached replacement for caching 10 million+ wiki pages (as a json document of all the pieces of our page, which comes out to around 51gb(edited) of data), and it works great. It runs on a single server, it serves up a single type of data, very quickly, and has been a pleasure to use. We keep other ancillary data sets on some other servers too, and it’s great for this. Tokyo Tyrant is a great example of very performant software, but it doesn’t scale. If you’d like to make it scale, it’s not very hard, you scale it exactly like Memcached (by some sort of application side hashing of keys). You can have as many servers as you’d like, but you can’t easily add servers to a cluster (increase in size while maintaining proportion) and therefore, you can’t tolerate significant increases in throughput. The good news it that here “significant” is relatively massive, and you probably won’t need to scale it any time soon.

2009-08-10 17:48

We tried to insert 160mil 2k -20k documents into a single Tokyo Tyrant server, and performance quickly dropped off and kept going down. You could have had a nice holiday skiing on the graph of inserts/sec. This is pretty typical of anything that you write to a disk. The more you write, the slower it goes. We could distribute the write easily, because Tokyo doesn’t scale.

Tokyo does support replication, and a few other great things, but these don’t make for scaling.

* We don’t use the full text search, so I can’t comment there.



type: Key/Value store with collections and counters

Conclusion: Doesn’t scale.

Redis is also awesome like Tokyo. I would say the two are pretty comparable as simple k/v stores. The counters and collections are AWESOME, and if I was still at AboutUs, I think I’d be pushing to move a couple pieces of the infrastructure to Redis. I have less to say about Redis because I haven’t used it in production, but it looks great if it fits your bill. Does it scale? No. It, just as memecached and tokyo tyrant, can do sharding by handling it in the client, and therefore, you can’t just start adding new servers and increase your throughput. Nor is it fault tolerant. Your redis server dies, and there goes that data. And just as tokyo tyrant and memcached, you probably won’t ever need to try to scale it. Redis also supports replication.

Project Voldemort


type: Distributed Key/Value store

Conclusion: Scales!

Voldemort is a very cool project that comes out of LinkedIn. They seem to even be providing a full time guy doing development and support via a mailing list. Kudos to them, because Voldemort, as far as I can tell, is great. Best of all, it scales. You can add servers to the cluster, you don’t do any client side hashing, throughput is increased as the size of the cluster increases. As far as I can tell, you can handle any increase in requests by adding servers as well as those servers being fault tolerant, so a dead server doesn’t bring down the cluster.

Voldemort does have a downside for me, because I primarily use ruby and the provided client is written in java, so you either have to use JRuby (which is awesome but not always realistic) or Facebook Thrift to interact with Voldemort. This means thrift has to be compiled on all of your machines, and since Thrift uses Boost C++ library, and Boost C++ library is both slow and painful to compile, deployment of Voldemort apps is increased significantly.

Voldemort is also intersting because it has pluggable data storage backend and the bulk of it is mostly for the sharding and fault tolerance and less about data storage. Voldemort might actually be a good layer on top of Redis or Tokyo Cabinet some day.

Voldemort, it should be noted, is also only going to be worth using if you actually need to spread your data out over a cluster of servers. If your data fits on a single server in Tokyo Tyrant, you are not going to gain anything by using Voldemort. Voldemort however, might be seen as a good migration path from Tokyo * when you do hit that wall were performance isn’t enough.



type: Document Database

Conclusion: Doesn’t scale (yet!)

MongoDB is not a key/value store, it’s quite a bit more. It’s definitely not a RDBMS either. I haven’t used MongoDB in production, but I have used it a little building a test app and it is a very cool piece of kit. It seems to be very performant and either has, or will have soon, fault tolerance and auto-sharding (aka it will scale). I think Mongo might be the closest thing to a RDBMS replacement that I’ve seen so far. It won’t work for all data sets and access patterns, but it’s built for your typical CRUD stuff. Storing what is essentially a huge hash, and being able to select on any of those keys, is what most people use a relational database for. If your DB is 3NF and you don’t do any joins (you’re just selecting a bunch of tables and putting all the objects together, AKA what most people do in a web app), MongoDB would probably kick ass for you.

Oh, and did I mention that, of all the NoSQL options out there, MongoDB is the one of the only ones being developed as a business with commercial support available? If you’re dealing with lots of other people’s data, and have a business built on the data in your DB, this isn’t trivial.

On a side note, if you use Ruby, check out MongoMapper for very easy and nice to use ruby access.

2009-08-10 17:48



type: Column Database

Conclusion: Probably scales

Cassandra is another very promising project that I wouldn’t use yet. Cassandra came out of Facebook and seems to be in use there powering search in your inbox. It’s described as a distributed key/value store, where values can be collections of other key/values (called column families). It is definitely supposed to scale, and probably does at Facebook, by simply adding another machine (they will hook up with each other using a gossip protocol), but the OSS version doesn’t seem to support some key things, like loosing a machine all together. They are also in the midst of changing how the basic datastructures are stored on disk, and I don’t know that I’d trust my data to this sexy db until those things are worked out, which should be soon.

Cassandra also seems like a contender for a primary database or RDBMS replacement, as soon as it matures. The scaling possibilities are very attractive, and complex data structures shouldn’t be hard to model in it. I’m not going to go any deeper on cassandra because Evan Weaver did a great job of it here, but I will say that Cassandra is very promising and we were (when I left) looking at it very closely at

Amazon S3


type: key/value store

Conclusion: Scales amazingly well

You’re probably all like “What?!?”. But guess what, S3 is a killer key/value store. It is not as performant as any of the other options, but it scales *insanely* well. It scales so well, you don’t do anything. You just keep sticking shit in it, and it keeps pumping it out. Sometimes it’s faster than other times, but most of the time it’s fast enough. In fact, it’s faster than hitting mysql with 10 queries (for us). S3 is my favorite k/v data store of any out there.



type: RDBMS

Conclusion: Doesn’t Scale

Now you are probably like “Dude, what?!? You got some SQL in this NoSQL article”. I’ve got news for you guys, mysql is a pretty bad ass key/value store. It can do everything that Tokyo and Redis can do, and it really isn’t that much slower. In fact, for some data sets, I’ve seen MySQL perform ALOT faster than Tokyo Tyrant (I’ll post my findings in a follow up). For most applications (and say, FriendFeed), MySQL is plenty fast and it’s familiar and ubiquitous. I’m sure the NoSQL guys reading this will all be saying “Yeah, but we are dealing with more data than MySQL can handle”. Well, you might be dealing with more data than mysql used as a RDBMS might be able to handle, but it’s just as easy or easier to shard MySQL as it is Tokyo or Redis, and it’s hard to argue that they can win on many other points.

4Go 1 2 3 4 下一页