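Redis Cluster Quick Installation Guide

This article walks through installing and using Redis Server 3.0.0 with Redis Cluster.

These are the servers used for the experiment (public IP / private IP / hostname):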

# 212.71.252.54  / 192.168.171.141 / node1
# 176.58.103.254 / 192.168.171.142 / node2
# 178.79.153.89  / 192.168.173.227 / node3

Local hosts entries (on each server):

# local hosts
192.168.171.141 node1
192.168.171.142 node2
192.168.173.227 node3

And the remote hosts entries (on your own PC):

# remote hosts
212.71.252.54  node1
176.58.103.254 node2
178.79.153.89  node3

First, download and unpack the source on each node:

mkdir build && cd build
wget http://download.redis.io/releases/redis-3.0.0.tar.gz
tar -xvzf redis-3.0.0.tar.gz
cd redis-3.0.0/

Now we can build:

apt-get install -y make gcc build-essential
make MALLOC=libc # also jemalloc can be used

Good, now we can run the tests:

apt-get install -y tk8.5 tcl8.5
make test
# a lot of output, should be green

Finally, we can start Redis:

src/redis-server ./redis.conf

Cluster configuration

Change the configuration on each node as follows:

# redis.conf
# bind to this node's own hostname (node1 shown here)
bind node1
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 5000
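
Since the same four settings are repeated on every node with only the bind address changing, the fragment can be generated. This is a minimal sketch; the helper name render_cluster_conf is hypothetical, and the explicit port line is an addition (6379 is the default the article relies on).

```python
# Hypothetical helper: renders the cluster-related redis.conf lines for one node.
# The directive names come from the article; "port" is added here so that the
# cluster-config-file name and the listening port stay in sync.
def render_cluster_conf(bind_host, port=6379, node_timeout_ms=5000):
    lines = [
        f"bind {bind_host}",
        f"port {port}",
        "cluster-enabled yes",
        f"cluster-config-file nodes-{port}.conf",
        f"cluster-node-timeout {node_timeout_ms}",
    ]
    return "\n".join(lines) + "\n"


# Print one fragment per node used in this article.
for host in ("node1", "node2", "node3"):
    print(f"# redis.conf fragment for {host}")
    print(render_cluster_conf(host))
```

Each comment sits on its own line because redis.conf does not treat a trailing "# ..." on a directive line as a comment.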

After restarting the server with this configuration, you will see output like this:

29338:M 13 Apr 21:05:00.214 * No cluster configuration found, I'm a1eec932d923b55e23a5fe6a488ed7a97e27c826
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.0.0 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in cluster mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 29338
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

29338:M 13 Apr 21:05:00.246 # Server started, Redis version 3.0.0
29338:M 13 Apr 21:05:00.247 * DB loaded from disk: 0.000 seconds
29338:M 13 Apr 21:05:00.247 * The server is now ready to accept connections on port 6379

That is a lot of log output, but only one line really matters:

No cluster configuration found, I'm a1eec932d923b55e23a5fe6a488ed7a97e27c826

This shows that our Redis server is running in cluster mode; the hex string is the node ID it generated for itself.
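
If you collect the startup logs from all three nodes, the node ID can be pulled out of that line with a regular expression. A small sketch, assuming the log line has exactly the wording shown above; the function name extract_node_id is made up for illustration.

```python
import re

# A node ID is 40 lowercase hex characters, as in the log line above.
NODE_ID_RE = re.compile(r"No cluster configuration found, I'm ([0-9a-f]{40})")


def extract_node_id(log_line):
    """Return the node ID from a startup log line, or None if it is absent."""
    match = NODE_ID_RE.search(log_line)
    return match.group(1) if match else None


line = ("29338:M 13 Apr 21:05:00.214 * No cluster configuration found, "
        "I'm a1eec932d923b55e23a5fe6a488ed7a97e27c826")
print(extract_node_id(line))
```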

(… repeat the steps above on every node …)

Connecting the nodes

We now have three nodes:

node1:6379
node2:6379
node3:6379

They are not yet connected to each other, so next we join them together. Redis ships a tool for this called redis-trib.rb. It is a Ruby script and needs the redis gem to be installed.

➜  redis-3.0.0 src/redis-trib.rb                    
Usage: redis-trib <command> <options> <arguments ...>

  create          host1:port1 ... hostN:portN
                  --replicas <arg>
  check           host:port
  fix             host:port
  reshard         host:port
                  --from <arg>
                  --to <arg>
                  --slots <arg>
                  --yes
  add-node        new_host:new_port existing_host:existing_port
                  --slave
                  --master-id <arg>
  del-node        host:port node_id
  set-timeout     host:port milliseconds
  call            host:port command arg arg .. arg
  import          host:port
                  --from <arg>
  help            (show this help)

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.

For some reason the tool does not accept hostnames, so we have to pass the IP addresses by hand.
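
Typing the IPs by hand is error-prone, so the lookup can be scripted. The sketch below builds the argument list for redis-trib.rb from hostnames. build_create_command is a hypothetical helper, and the resolver is injectable so the example runs without DNS; on a real server the default socket.gethostbyname would read the hosts entries shown earlier.

```python
import socket


def build_create_command(hostnames, port=6379, resolve=socket.gethostbyname):
    """Resolve each hostname to an IPv4 address and build the create command."""
    endpoints = [f"{resolve(name)}:{port}" for name in hostnames]
    return ["src/redis-trib.rb", "create"] + endpoints


# Stand-in resolver using the local hosts entries from this article.
LOCAL_HOSTS = {
    "node1": "192.168.171.141",
    "node2": "192.168.171.142",
    "node3": "192.168.173.227",
}

command = build_create_command(["node1", "node2", "node3"],
                               resolve=LOCAL_HOSTS.__getitem__)
print(" ".join(command))
```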

➜  redis-3.0.0 src/redis-trib.rb create 192.168.171.141:6379 192.168.171.142:6379 192.168.173.227:6379

>>> Creating cluster
Connecting to node 192.168.171.141:6379: OK
Connecting to node 192.168.171.142:6379: OK
Connecting to node 192.168.173.227:6379: OK
>>> Performing hash slots allocation on 3 nodes...
Using 3 masters:
192.168.171.141:6379
192.168.171.142:6379
192.168.173.227:6379
M: 78a5bbdcd545848be8a66126a71dc69dd6d23bc4 192.168.171.141:6379
   slots:0-5460 (5461 slots) master
M: 1f6ed2478b461539f76b0b627de2e1b8565df719 192.168.171.142:6379
   slots:5461-10922 (5462 slots) master
M: 7a092b06c8c75e98176b7612e74d2e89e8b3eda7 192.168.173.227:6379
   slots:10923-16383 (5461 slots) master
Can I set the above configuration? (type 'yes' to accept):

Looks good: each node is responsible for about a third of the 16384 hash slots. Type 'yes'…

>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: 78a5bbdcd545848be8a66126a71dc69dd6d23bc4 192.168.171.141:6379
   slots:0-5460 (5461 slots) master
M: 1f6ed2478b461539f76b0b627de2e1b8565df719 192.168.171.142:6379
   slots:5461-10922 (5462 slots) master
M: 7a092b06c8c75e98176b7612e74d2e89e8b3eda7 192.168.173.227:6379
   slots:10923-16383 (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
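
The slot ranges in this output follow from dividing the 16384 hash slots evenly among the masters. The sketch below reproduces them; it is an assumption about how an even split with rounding can be done, not a quotation of the redis-trib.rb source, and split_slots is a name made up for this example.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(master_count):
    """Split slots 0..16383 into contiguous ranges, one per master."""
    per_node = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Round half up to pick the last slot of this range.
        last = math.floor(cursor + per_node - 1 + 0.5)
        # The final master always takes everything that is left.
        if last > TOTAL_SLOTS - 1 or index == master_count - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


for first, last in split_slots(3):
    print(f"slots:{first}-{last} ({last - first + 1} slots)")
```

For three masters this gives 0-5460, 5461-10922 and 10923-16383, matching the ranges printed by the tool.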

Done.

Testing the cluster

Here is how to check the cluster state:

➜  redis-3.0.0 src/redis-cli -h node2 cluster nodes
7a092b06c8c75e98176b7612e74d2e89e8b3eda7 node1:6379 master - 0 1428949630273 3 connected 10923-16383
78a5bbdcd545848be8a66126a71dc69dd6d23bc4 node2:6379 myself,master - 0 0 1 connected 0-5460
1f6ed2478b461539f76b0b627de2e1b8565df719 node3:6379 master - 0 1428949629272 2 connected 5461-10922

Every node knows about every other node, so the command above can be run against any of them.
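
For scripting, the `cluster nodes` output is easy to turn into structured records, since each line is a space-separated list of fields. A minimal sketch: the dictionary keys are names chosen for this example, and the field order is assumed to be the one visible in the output above (ID, address, flags, master ID, ping sent, pong received, config epoch, link state, slots).

```python
def parse_cluster_nodes(text):
    """Parse `cluster nodes` output into a list of dictionaries."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip anything that is not a full node line
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),
            "master_id": None if fields[3] == "-" else fields[3],
            "config_epoch": int(fields[6]),
            "link_state": fields[7],
            "slots": fields[8:],
        })
    return nodes


SAMPLE = """\
7a092b06c8c75e98176b7612e74d2e89e8b3eda7 node1:6379 master - 0 1428949630273 3 connected 10923-16383
78a5bbdcd545848be8a66126a71dc69dd6d23bc4 node2:6379 myself,master - 0 0 1 connected 0-5460
1f6ed2478b461539f76b0b627de2e1b8565df719 node3:6379 master - 0 1428949629272 2 connected 5461-10922
"""

for node in parse_cluster_nodes(SAMPLE):
    print(node["addr"], node["flags"], node["slots"])
```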

Benchmarks

Let's set up redis-rb-cluster (https://github.com/antirez/redis-rb-cluster):

➜  build wget https://github.com/antirez/redis-rb-cluster/archive/master.zip
➜  build unzip master.zip
➜  build cd redis-rb-cluster-master

The example.rb file is fairly dull: it simply writes keys to our cluster and prints the values back:

➜  redis-rb-cluster-master ruby example.rb
1
2
3
...
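
Every key written this way lands on one of the 16384 hash slots, and the slot decides which node stores it. Redis Cluster computes the slot as CRC16 of the key modulo 16384, hashing only the part inside the first non-empty {...} if the key has one. The sketch below is a plain-Python illustration of that rule; the function names are chosen for this example and are not part of any client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honouring {hash tags}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag replaces the whole key as hash input.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


for key in ("foo", "bar", "{user1000}.following", "{user1000}.followers"):
    print(key, key_slot(key))
```

Keys that share a hash tag, like the last two above, always map to the same slot and therefore to the same node.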

The other example is more interesting:

➜  redis-rb-cluster-master ruby consistency-test.rb node1 6379
850 R (0 err) | 850 W (0 err) | 
4682 R (0 err) | 4682 W (0 err) | 
8490 R (0 err) | 8490 W (0 err) | 
12196 R (0 err) | 12196 W (0 err) | 
15785 R (0 err) | 15785 W (0 err) |

This tool writes a large amount of data to Redis and checks whether the data it wrote is still there.
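
The write-then-verify idea can be sketched in a few lines. This is not the logic of consistency-test.rb itself, only an illustration of the principle; FakeStore is an in-memory stand-in for a cluster client, and all names here are hypothetical.

```python
class FakeStore:
    """In-memory stand-in for a cluster client (hypothetical)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


def run_consistency_check(store, iterations, key_count=10):
    """Write values, remember them locally, and count reads that disagree."""
    expected = {}
    stats = {"reads": 0, "writes": 0, "lost": 0}
    for i in range(iterations):
        key = f"key_{i % key_count}"
        # Verify the previous write to this key before overwriting it.
        if key in expected:
            stats["reads"] += 1
            if store.get(key) != expected[key]:
                stats["lost"] += 1
        value = str(i)
        store.set(key, value)
        expected[key] = value
        stats["writes"] += 1
    return stats


print(run_consistency_check(FakeStore(), 100))
```

Against a store that never loses data, the lost counter stays at zero; any acknowledged write that later reads back differently would increment it.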

Testing failover

Run consistency-test.rb again and, while it is running, crash the Redis server on one of the other nodes:

➜  redis-3.0.0 src/redis-cli -h node2 debug segfault

This is the output you get on each run:

70273 R (0 err) | 70273 W (0 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 9515 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 9515 127.0.0.1:7002)
72378 R (1 err) | 72378 W (1 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 9650 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 9650 127.0.0.1:7002)
72379 R (2 err) | 72379 W (2 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 5797 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 5797 127.0.0.1:7002)
72380 R (3 err) | 72380 W (3 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 9772 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 9772 127.0.0.1:7002)
72384 R (4 err) | 72384 W (4 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 10245 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 10245 127.0.0.1:7002)
72385 R (5 err) | 72385 W (5 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 7376 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 7376 127.0.0.1:7002)
72385 R (6 err) | 72385 W (6 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 6781 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 6781 127.0.0.1:7002)
72396 R (7 err) | 72396 W (7 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 10275 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 10275 127.0.0.1:7002)
72401 R (8 err) | 72401 W (8 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 8639 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 8639 127.0.0.1:7002)
72402 R (9 err) | 72402 W (9 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 8173 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 8173 127.0.0.1:7002)
72402 R (10 err) | 72402 W (10 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 9525 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 9525 127.0.0.1:7002)
72403 R (11 err) | 72403 W (11 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 9346 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 9346 127.0.0.1:7002)
72406 R (12 err) | 72406 W (12 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 6391 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 6391 127.0.0.1:7002)
72411 R (13 err) | 72411 W (13 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 6353 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 6353 127.0.0.1:7002)
72413 R (14 err) | 72413 W (14 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 10245 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 10245 127.0.0.1:7002)
72418 R (15 err) | 72418 W (15 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 6438 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 6438 127.0.0.1:7002)
72422 R (16 err) | 72422 W (16 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 6826 127.0.0.1:7002)
Writing: Too many Cluster redirections? (last error: MOVED 6826 127.0.0.1:7002)
72423 R (17 err) | 72423 W (17 err) | 
Reading: Too many Cluster redirections? (last error: MOVED 9713 127.0.0.1:7002)
Writing: CLUSTERDOWN The cluster is down
Reading: CLUSTERDOWN The cluster is down
Reading: CLUSTERDOWN The cluster is down
Writing: CLUSTERDOWN The cluster is down
72423 R (295 err) | 72423 W (295 err) | 
72423 R (2219 err) | 72423 W (2219 err) | 
Reading: CLUSTERDOWN The cluster is down
Writing: CLUSTERDOWN The cluster is down
Reading: CLUSTERDOWN The cluster is down
Writing: CLUSTERDOWN The cluster is down
72423 R (4186 err) | 72423 W (4186 err) | 
Reading: CLUSTERDOWN The cluster is down
Writing: CLUSTERDOWN The cluster is down
72423 R (6190 err) | 72423 W (6190 err) | 
Writing: CLUSTERDOWN The cluster is down
72423 R (8207 err) | 72423 W (8207 err) | 
Reading: CLUSTERDOWN The cluster is down
Reading: CLUSTERDOWN The cluster is down
Writing: CLUSTERDOWN The cluster is down

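As you can see, the cluster failed: the error count kept growing and eventually the whole cluster went down. The cluster was created without --replicas, so there was no slave to take over the slots of the crashed master.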
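
The two kinds of error in that log mean different things: MOVED names the slot and the node the client should retry against, while CLUSTERDOWN means the cluster is refusing requests altogether. A small sketch for telling them apart when processing such logs; classify_error and the returned dictionary keys are invented for this example.

```python
import re

# Matches e.g. "MOVED 9515 127.0.0.1:7002" anywhere in a message.
MOVED_RE = re.compile(r"MOVED (\d+) ([^\s:)]+):(\d+)")


def classify_error(message):
    """Classify a cluster error message as MOVED, CLUSTERDOWN or OTHER."""
    match = MOVED_RE.search(message)
    if match:
        return {
            "kind": "MOVED",
            "slot": int(match.group(1)),
            "host": match.group(2),
            "port": int(match.group(3)),
        }
    if "CLUSTERDOWN" in message:
        return {"kind": "CLUSTERDOWN"}
    return {"kind": "OTHER"}


print(classify_error(
    "Reading: Too many Cluster redirections? (last error: MOVED 9515 127.0.0.1:7002)"))
print(classify_error("Writing: CLUSTERDOWN The cluster is down"))
```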

➜  redis-3.0.0 src/redis-cli -h node3 ping
PONG
➜  redis-3.0.0 src/redis-cli -h node3 get 'test'
(error) CLUSTERDOWN The cluster is down

After starting the crashed node again by hand, the cluster came back up:

# running node2 manually...
➜  redis-3.0.0  src/redis-cli -h  get qwe
(error) MOVED 757 127.0.0.1:7001
➜  redis-3.0.0  src/redis-cli -p 7001 get qwe
(nil)

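Of course, there are still some bugs here, which will presumably get fixed.

Summary

According to https://github.com/antirez/redis-rb-cluster#redis-rb-cluster, a lot still has to be prepared before taking this into production; here we only took a quick, simple pass over the general process.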

 
