So oojdon has been reading all the advanced material from beyond the firewall. Neither the image nor the URL is accessible, heh.

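On 2010-05-12 10:55, "banq" wrote:
So oojdon has been reading all the advanced material from beyond the firewall. Neither the image nor the URL is accessible, heh ...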

I only found out it was inaccessible after posting it, so I've uploaded the image and posted it here.
Excerpt:
In addition to CAP configurations, another significant way data management systems vary is by the data model they use: relational, key-value, column-oriented, or document-oriented (there are others, but these are the main ones).

Relational systems are the databases we've been using for a while now. RDBMSs and systems that support ACIDity and joins are considered relational.
Key-value systems basically support get, put, and delete operations based on a primary key.
Column-oriented systems still use tables but have no joins (joins must be handled within your application). Obviously, they store data by column as opposed to traditional row-oriented databases. This makes aggregations much easier.
Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be handled within your application). It's very easy to map data from object-oriented software to these systems.
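To make the key-value and document-oriented descriptions above concrete, here is a minimal sketch in Python. It is not taken from the quoted article or from any of the products named in this thread: the `KeyValueStore` class, the `users` and `orders` collections, and `orders_with_user_names` are all hypothetical names made up for illustration. It shows a store that only offers get, put, and delete by key, and a "join" that has to be written in the application because the store has none.

```python
class KeyValueStore:
    """Toy in-memory store (hypothetical): only get, put, and delete by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Deleting a missing key is a no-op in this sketch.
        self._data.pop(key, None)


# Two hypothetical collections holding JSON-like documents.
users = KeyValueStore()
orders = KeyValueStore()

users.put("u1", {"name": "alice"})
users.put("u2", {"name": "bob"})
orders.put("o1", {"user_id": "u1", "item": "book"})
orders.put("o2", {"user_id": "u2", "item": "pen"})
orders.put("o3", {"user_id": "u1", "item": "lamp"})


def orders_with_user_names(order_ids):
    """Application-side join: fetch each order by key, then its user by key."""
    joined = []
    for order_id in order_ids:
        order = orders.get(order_id)
        if order is None:
            continue  # unknown order ids are skipped
        user = users.get(order["user_id"], {})
        joined.append(
            {"order": order_id, "item": order["item"], "name": user.get("name")}
        )
    return joined
```

Calling `orders_with_user_names(["o1", "o3"])` does one lookup per order plus one per user, which is the work a relational database would otherwise do inside its join.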

Excerpt:
Consistent, Available (CA) Systems have trouble with partitions and typically deal with it with replication. Examples of CA systems include:

Traditional RDBMSs like Postgres, MySQL, etc (relational)
Vertica (column-oriented)
Aster Data (relational)
Greenplum (relational)

Consistent, Partition-Tolerant (CP) Systems have trouble with availability while keeping data consistent across partitioned nodes. Examples of CP systems include:

BigTable (column-oriented/tabular)
Hypertable (column-oriented/tabular)
HBase (column-oriented/tabular)
MongoDB (document-oriented)
Terrastore (document-oriented)
Redis (key-value)
Scalaris (key-value)
MemcacheDB (key-value)
Berkeley DB (key-value)

Available, Partition-Tolerant (AP) Systems achieve "eventual consistency" through replication and verification. Examples of AP systems include:

Dynamo (key-value)
Voldemort (key-value)
Tokyo Cabinet (key-value)
KAI (key-value)
Cassandra (column-oriented/tabular)
CouchDB (document-oriented)
SimpleDB (document-oriented)
Riak (document-oriented)



[This post was edited by oojdon on 2010-05-12 11:00]


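Let me give a rough translation: databases today come in the following types (graph databases seem to be missing from the list):
1. Relational: uses ACID transactions and locking, and expresses relationships with joins.
2. Key-value: get, put, and delete operations based on a primary key.
3. Column-oriented: still table-based but stored by column, with no joins (you implement them yourself in the application); well suited to aggregation.
4. Document-oriented: data is stored in JSON or XML format, with no joins; easy to persist object data.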
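Why storing by column helps aggregation can be sketched in a few lines of Python. This is only an illustration with made-up sample data; the `rows` records and the `to_columns` helper are hypothetical, and real column stores add compression and on-disk layouts that are not shown here.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Beijing", "amount": 30},
    {"id": 2, "city": "Shanghai", "amount": 45},
    {"id": 3, "city": "Beijing", "amount": 25},
]


def to_columns(records):
    """Pivot row records into one list per column.

    Assumes every record has the same fields (true for the sample data).
    """
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Aggregating over rows walks every record, although only one field is needed.
total_from_rows = sum(record["amount"] for record in rows)

# Aggregating over a column reads just the one list that holds that field.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the difference is that the column layout lets the sum read only the `amount` values instead of whole records.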

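The CAP theorem can be used to classify these kinds of databases. There are three broad categories:
CA: aims for strong consistency and high accuracy. Some fellow members here even worry about how to roll back when an in-memory write goes wrong; that mindset belongs to the CA type. Traditional relational databases such as MySQL and Oracle fall into this category. Its problem is that partitioning, backup, and replication are cumbersome to handle, which means distributed storage is cumbersome.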

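CP: holds on to consistency and partition tolerance. When data is kept consistent across network partitions, availability suffers somewhat and client read/write performance is somewhat worse: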
BigTable (column-oriented/tabular)
Hypertable (column-oriented/tabular)
HBase (column-oriented/tabular)
MongoDB (document-oriented)
Terrastore (document-oriented)
Redis (key-value)
Scalaris (key-value)
MemcacheDB (key-value)
Berkeley DB (key-value)
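The CP trade-off described above (stay consistent, give up some availability during a partition) can be sketched with a majority-quorum rule. This is an assumption chosen for illustration, not a description of how any system in the list above actually works; `QuorumRegister` and its `reachable` set are hypothetical.

```python
class QuorumRegister:
    """Toy replicated register: every read and write needs a majority of replicas."""

    def __init__(self, replica_count):
        self.replicas = [{"version": 0, "value": None} for _ in range(replica_count)]
        # Indexes of the replicas this client can currently reach.
        self.reachable = set(range(replica_count))

    def _require_majority(self):
        quorum = len(self.replicas) // 2 + 1
        if len(self.reachable) < quorum:
            # Refusing the request is what costs availability.
            raise RuntimeError("unavailable: no majority of replicas reachable")

    def write(self, value):
        self._require_majority()
        version = max(self.replicas[i]["version"] for i in self.reachable) + 1
        for i in self.reachable:
            self.replicas[i] = {"version": version, "value": value}

    def read(self):
        self._require_majority()
        # Any two majorities overlap, so the newest version is always visible.
        newest = max(
            (self.replicas[i] for i in self.reachable),
            key=lambda replica: replica["version"],
        )
        return newest["value"]
```

With three replicas, a client that can reach only one of them gets an error instead of a possibly stale answer, while a client on the majority side keeps reading and writing consistent data.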


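AP: holds on to availability and partition tolerance and sacrifices some consistency; in other words it settles for eventual consistency, the BASE approach. Well suited to social media.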
Dynamo (key-value)
Voldemort (key-value)
Tokyo Cabinet (key-value)
KAI (key-value)
Cassandra (column-oriented/tabular)
CouchDB (document-oriented)
SimpleDB (document-oriented)
Riak (document-oriented)
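Eventual consistency can be sketched as replicas that always accept writes and reconcile later. The reconciliation rule below is last-write-wins on caller-supplied timestamps, which is just one simple strategy picked for illustration; it is an assumption, not a claim about what the systems listed above do. The `Replica` class and its methods are hypothetical.

```python
class Replica:
    """Toy AP replica: always accepts writes and reconciles with peers later."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def put(self, key, value, timestamp):
        current = self.data.get(key)
        # Last-write-wins: keep whichever entry has the larger timestamp.
        # Ties keep the local entry; a real system would need a
        # deterministic tie-breaker so replicas cannot diverge.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def get(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Anti-entropy pass: pull every entry from another replica."""
        for key, (timestamp, value) in other.data.items():
            self.put(key, value, timestamp)


# During a partition, each side accepts its own write.
a, b = Replica(), Replica()
a.put("status", "old", timestamp=1)
b.put("status", "new", timestamp=2)
diverged = a.get("status") != b.get("status")

# Once the partition heals, the replicas exchange data and converge.
a.sync_from(b)
b.sync_from(a)
```

Between the two writes and the sync, readers of `a` and `b` see different values; that window is the consistency being given up in exchange for both sides staying writable.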

Overall: CA - CP - AP are ordered by steadily decreasing consistency requirements.

[This post was edited by banq on 2010-05-12 11:38]
[This post was edited by banq on 2010-05-12 11:39]
[This post was edited by banq on 2010-05-12 11:43]

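The CAP theorem brings the various existing types of databases under one framework and gives us a clearer picture of which type of database to use in which scenario. The old rule of analyzing each problem on its own merits still applies~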

There is a PDF version of this article; why not repost that as well?