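RocketMQ High-Availability Cluster Deployment in Practice: 3-Master/3-Slave Synchronous Double-Write Architecture
1. RocketMQ Cluster Architecture Overview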
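1.1 Single-Node Mode
A single Broker is a single point of failure: if it goes down, the entire messaging service becomes unavailable. This mode is not recommended for production.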
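1.2 Multi-Master Architecture
The cluster consists entirely of Master nodes, for example 2 or 3 Masters.
Advantages:
- Simple configuration and excellent performance
- The failure of a single Master does not take down the service as a whole
- With RAID10 disk arrays, messages are not lost even if the physical machine cannot be recovered, thanks to the reliability of the disks (asynchronous flush may lose a small number of messages; synchronous flush loses none)
Drawbacks:
- Until the failed node recovers, unconsumed messages stored on that node cannot be consumed
- Message timeliness suffers as a result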
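1.3 Multi-Master Multi-Slave Mode with Asynchronous Replication
Each Master is paired with one Slave, forming several Master-Slave pairs. Replication is asynchronous, so the Slave lags the Master by a few milliseconds.
Advantages:
- Even if a disk fails, very few messages are lost, and message timeliness is unaffected
- After a Master fails, consumers automatically switch to the Slave and keep consuming, transparently to the application
- Performance is roughly on par with multi-Master mode
Drawbacks:
- In the extreme case where a Master fails and its disk is damaged, a small number of messages may be lost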
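1.4 Multi-Master Multi-Slave Mode with Synchronous Double Write
Each Master is paired with one Slave, and writes are synchronous: success is returned to the application only after both the Master and the Slave have written the message.
Advantages:
- No single point of failure; both data and service are highly available
- When a Master fails, messages are not delayed, and service availability and data availability are both at their highest
Drawbacks:
- Throughput is slightly lower than with asynchronous replication, by roughly 10%
- The response time of a single message send is slightly longer
- In the current version a Slave cannot be promoted to Master automatically after the Master fails; automatic failover is planned for later versions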
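The three clustered modes above differ mainly in two Broker settings, brokerId and brokerRole (the flush policy, flushDiskType, is chosen independently). The sketch below prints the pair of settings for a given mode and node kind. The function name `replication_settings` and the mode labels `multi-master`, `async`, and `sync` are illustrative assumptions, not RocketMQ terms.

```shell
#!/usr/bin/env bash
# Sketch: map a replication mode to the broker settings that select it.
# Args: mode (multi-master|async|sync), node kind (master|slave).
# The function name and mode labels are hypothetical.
replication_settings() {
    local mode="$1" kind="$2"
    if [ "$kind" = "slave" ]; then
        # A multi-Master cluster has no Slaves at all
        [ "$mode" = "multi-master" ] && return 1
        # Slaves use a brokerId greater than 0 and the SLAVE role in both replicated modes
        printf 'brokerId=1\nbrokerRole=SLAVE\n'
        return 0
    fi
    case "$mode" in
        multi-master|async) printf 'brokerId=0\nbrokerRole=ASYNC_MASTER\n' ;;
        sync)               printf 'brokerId=0\nbrokerRole=SYNC_MASTER\n' ;;
        *) return 1 ;;
    esac
}
```

For example, `replication_settings sync master` prints the two lines a Master needs for synchronous double write, which is the mode deployed in the rest of this article.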
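2. Infrastructure Preparation
2.1 Node Planning and Network Configuration
This setup uses 3 Master nodes and 3 Slave nodes; the three NameServers run on the same hosts as the Masters. Roles and IP addresses are assigned as follows:
| Role | Hostname | IP address |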
|---|---|---|
| NameServer | rocketmq_master_01 | 172.16.150.131 |
| NameServer | rocketmq_master_02 | 172.16.150.132 |
| NameServer | rocketmq_master_03 | 172.16.150.133 |
| Master A | rocketmq_master_01 | 172.16.150.131 |
| Master B | rocketmq_master_02 | 172.16.150.132 |
| Master C | rocketmq_master_03 | 172.16.150.133 |
| Slave A | rocketmq_slave_01 | 172.16.150.134 |
| Slave B | rocketmq_slave_02 | 172.16.150.135 |
| Slave C | rocketmq_slave_03 | 172.16.150.136 |
2.2 Software Requirements
- Operating system: CentOS 6.x/7.x
- JDK: 1.8+
- RocketMQ: alibaba-rocketmq-3.2.2
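2.3 System Initialization
# 1. Disable the firewall (in production, follow your security policy)
# (CentOS 6 has no systemctl; use "service iptables stop" and "chkconfig iptables off" there)
systemctl stop firewalld
systemctl disable firewalld
# 2. Disable SELinux (in production, handle as required)
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# 3. Configure time synchronization against a public time server: add the entry below with crontab -e, then confirm it with crontab -l
crontab -l
# Synchronize the system clock every 5 minutes
*/5 * * * * /usr/bin/rdate -s time-b.nist.gov &>/dev/null
# 4. Install and configure the Java runtime
# See the JDK installation documentation for details
# 5. Configure the hosts file (required on every node)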
echo "172.16.150.131 rocketmq_master_01" >> /etc/hosts
echo "172.16.150.132 rocketmq_master_02" >> /etc/hosts
echo "172.16.150.133 rocketmq_master_03" >> /etc/hosts
echo "172.16.150.134 rocketmq_slave_01" >> /etc/hosts
echo "172.16.150.135 rocketmq_slave_02" >> /etc/hosts
echo "172.16.150.136 rocketmq_slave_03" >> /etc/hosts
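Running the echo commands above a second time appends duplicate lines to /etc/hosts. A minimal idempotent variant is sketched below; the function names `add_host_entry` and `register_cluster_hosts` are illustrative assumptions, and the addresses are the ones from section 2.1.

```shell
#!/usr/bin/env bash
# Sketch: register the cluster hosts without creating duplicates.
# Function names are hypothetical helpers, not part of any tool.

# Append "IP hostname" to a hosts file only if the hostname is not mapped yet.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # -w matches the hostname as a whole word; -q suppresses output
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        echo "$ip $name" >> "$file"
    fi
}

# Register all six nodes in the given hosts file (defaults to /etc/hosts).
register_cluster_hosts() {
    local file="${1:-/etc/hosts}"
    add_host_entry "$file" 172.16.150.131 rocketmq_master_01
    add_host_entry "$file" 172.16.150.132 rocketmq_master_02
    add_host_entry "$file" 172.16.150.133 rocketmq_master_03
    add_host_entry "$file" 172.16.150.134 rocketmq_slave_01
    add_host_entry "$file" 172.16.150.135 rocketmq_slave_02
    add_host_entry "$file" 172.16.150.136 rocketmq_slave_03
}
```

Calling `register_cluster_hosts` as root on each node has the same effect as the six echo commands, and repeating the call leaves the file unchanged.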
3. Cluster Deployment Steps
3.1 Unpacking the Package and Initializing Directories
# Run the following on every node
[root@rocketmq_master_01 ~]# tar xf alibaba-rocketmq-3.2.2.tar.gz -C /opt
[root@rocketmq_master_01 ~]# cd /opt && ln -sv alibaba-rocketmq-3.2.2 alibaba-rocketmq
[root@rocketmq_master_01 ~]# mkdir -p /var/alibaba-rocketmq/{commitlog,consumequeue,index,logs,namesrv}
[root@rocketmq_master_01 ~]# tree /var/alibaba-rocketmq/
[root@rocketmq_master_01 ~]# ll /opt/
3.2 Configuration Directory Layout
[root@rocketmq_master_01 ~]# cd /opt/alibaba-rocketmq/conf/
[root@rocketmq_master_01 conf]# ll
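total 36
drwxr-xr-x 2 root root 118 2019-03-28 17:08 2m-2s-async # 2 Masters, 2 Slaves, asynchronous replication
drwxr-xr-x 2 root root 118 2019-03-28 17:08 2m-2s-sync # 2 Masters, 2 Slaves, synchronous double write
drwxr-xr-x 2 root root 118 2019-03-28 17:08 2m-noslave # 2 Masters, no Slaves
....
Create a configuration directory for the 3-Master/3-Slave synchronous mode: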
[root@rocketmq_master_01 conf]# mkdir 3m-3s-sync
[root@rocketmq_master_01 conf]# cd 3m-3s-sync/
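3.3 Master Configuration Files
broker-primary_a.properties - Master A configuration
# Name of the cluster this Broker belongs to
brokerClusterName=prod_rocketmq-cluster
# Broker group name: a Master and its Slave share the same name, and each Master-Slave pair uses a different one
brokerName=broker-primary-a
# Broker role ID: 0 means Master, greater than 0 means Slave
brokerId=0
# NameServer addresses, separated by semicolons
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
# Default number of queues for automatically created Topics
defaultTopicQueueNums=4
# Whether Topics may be created automatically (disabling is recommended in production)
autoCreateTopicEnable=true
# Whether subscription groups may be created automatically (disabling is recommended in production)
autoCreateSubscriptionGroup=true
# Port the Broker listens on for client traffic
listenPort=10911
# Hour of day at which expired files are deleted (04 = 4 a.m.)
deleteWhen=04
# File retention time (hours)
fileReservedTime=120
# Size of a single CommitLog file (default 1 GB)
mapedFileSizeCommitLog=1073741824
# Size of a single ConsumeQueue file (the default holds 300,000 entries)
mapedFileSizeConsumeQueue=300000
# Disk usage threshold (percent)
diskMaxUsedSpaceRatio=88
# Root directory for message storage
storePathRootDir=/var/alibaba-rocketmq/
# CommitLog storage path
storePathCommitLog=/var/alibaba-rocketmq/commitlog
# ConsumeQueue storage path
storePathConsumeQueue=/var/alibaba-rocketmq/consumequeue
# Message index storage path
storePathIndex=/var/alibaba-rocketmq/index
# Checkpoint file path
storeCheckpoint=/var/alibaba-rocketmq/checkpoint
# Abort file path
abortFile=/var/alibaba-rocketmq/abort
# Broker role
# ASYNC_MASTER - Master with asynchronous replication
# SYNC_MASTER - Master with synchronous double write
# SLAVE - Slave node
brokerRole=SYNC_MASTER
# Flush policy
# ASYNC_FLUSH - asynchronous flush
# SYNC_FLUSH - synchronous flush
flushDiskType=SYNC_FLUSH
# Transaction message check switch (open-source 3.2.2 does not support transactional messages)
checkTransactionMessageEnable=false
# IP address of this host used by the Broker
brokerIP1=172.16.150.131
# Number of Netty worker threads
serverWorkerThreads = 8
# Number of threads in the Netty callback executor pool
serverCallbackExecutorThreads = 2
# Number of Netty selector threads
serverSelectorThreads = 3
# Semaphore size for one-way requests
serverOnewaySemaphoreValue = 256
# Semaphore size for asynchronous requests
serverAsyncSemaphoreValue = 64
# Maximum idle time of a channel (seconds)
serverChannelMaxIdleTimeSeconds = 120
# Netty send buffer size
serverSocketSndBufSize = 65535
# Netty receive buffer size
serverSocketRcvBufSize = 65535
# Whether to enable the Netty pooled buffer allocator
serverPooledByteBufAllocatorEnable = true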
broker-primary_b.properties - Master B configuration
brokerClusterName=prod_rocketmq-cluster
brokerName=broker-primary-b
brokerId=0
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
defaultTopicQueueNums=4
autoCreateTopicEnable=true
autoCreateSubscriptionGroup=true
listenPort=10911
deleteWhen=04
fileReservedTime=120
mapedFileSizeCommitLog=1073741824
mapedFileSizeConsumeQueue=300000
diskMaxUsedSpaceRatio=88
storePathRootDir=/var/alibaba-rocketmq/
storePathCommitLog=/var/alibaba-rocketmq/commitlog
storePathConsumeQueue=/var/alibaba-rocketmq/consumequeue
storePathIndex=/var/alibaba-rocketmq/index
storeCheckpoint=/var/alibaba-rocketmq/checkpoint
abortFile=/var/alibaba-rocketmq/abort
brokerRole=SYNC_MASTER
flushDiskType=SYNC_FLUSH
checkTransactionMessageEnable=false
brokerIP1=172.16.150.132
serverWorkerThreads = 8
serverCallbackExecutorThreads = 2
serverSelectorThreads = 3
serverOnewaySemaphoreValue = 256
serverAsyncSemaphoreValue = 64
serverChannelMaxIdleTimeSeconds = 120
serverSocketSndBufSize = 65535
serverSocketRcvBufSize = 65535
serverPooledByteBufAllocatorEnable = true
broker-primary_c.properties - Master C configuration
brokerClusterName=prod_rocketmq-cluster
brokerName=broker-primary-c
brokerId=0
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
defaultTopicQueueNums=4
autoCreateTopicEnable=true
autoCreateSubscriptionGroup=true
listenPort=10911
deleteWhen=04
fileReservedTime=120
mapedFileSizeCommitLog=1073741824
mapedFileSizeConsumeQueue=300000
diskMaxUsedSpaceRatio=88
storePathRootDir=/var/alibaba-rocketmq/
storePathCommitLog=/var/alibaba-rocketmq/commitlog
storePathConsumeQueue=/var/alibaba-rocketmq/consumequeue
storePathIndex=/var/alibaba-rocketmq/index
storeCheckpoint=/var/alibaba-rocketmq/checkpoint
abortFile=/var/alibaba-rocketmq/abort
brokerRole=SYNC_MASTER
flushDiskType=SYNC_FLUSH
checkTransactionMessageEnable=false
brokerIP1=172.16.150.133
3.4 Slave Configuration Files
broker-secondary_a.properties - Slave A configuration
brokerClusterName=prod_rocketmq-cluster
brokerName=broker-primary-a
brokerId=1
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
defaultTopicQueueNums=4
autoCreateTopicEnable=true
autoCreateSubscriptionGroup=true
listenPort=10911
deleteWhen=04
fileReservedTime=120
mapedFileSizeCommitLog=1073741824
mapedFileSizeConsumeQueue=300000
diskMaxUsedSpaceRatio=88
storePathRootDir=/var/alibaba-rocketmq/
storePathCommitLog=/var/alibaba-rocketmq/commitlog
storePathConsumeQueue=/var/alibaba-rocketmq/consumequeue
storePathIndex=/var/alibaba-rocketmq/index
storeCheckpoint=/var/alibaba-rocketmq/checkpoint
abortFile=/var/alibaba-rocketmq/abort
brokerRole=SLAVE
flushDiskType=SYNC_FLUSH
checkTransactionMessageEnable=false
brokerIP1=172.16.150.134
broker-secondary_b.properties - Slave B configuration
brokerClusterName=prod_rocketmq-cluster
brokerName=broker-primary-b
brokerId=1
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
defaultTopicQueueNums=4
autoCreateTopicEnable=true
autoCreateSubscriptionGroup=true
listenPort=10911
deleteWhen=04
fileReservedTime=120
mapedFileSizeCommitLog=1073741824
mapedFileSizeConsumeQueue=300000
diskMaxUsedSpaceRatio=88
storePathRootDir=/var/alibaba-rocketmq/
storePathCommitLog=/var/alibaba-rocketmq/commitlog
storePathConsumeQueue=/var/alibaba-rocketmq/consumequeue
storePathIndex=/var/alibaba-rocketmq/index
storeCheckpoint=/var/alibaba-rocketmq/checkpoint
abortFile=/var/alibaba-rocketmq/abort
brokerRole=SLAVE
flushDiskType=SYNC_FLUSH
checkTransactionMessageEnable=false
brokerIP1=172.16.150.135
broker-secondary_c.properties - Slave C configuration
brokerClusterName=prod_rocketmq-cluster
brokerName=broker-primary-c
brokerId=1
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
defaultTopicQueueNums=4
autoCreateTopicEnable=true
autoCreateSubscriptionGroup=true
listenPort=10911
deleteWhen=04
fileReservedTime=120
mapedFileSizeCommitLog=1073741824
mapedFileSizeConsumeQueue=300000
diskMaxUsedSpaceRatio=88
storePathRootDir=/var/alibaba-rocketmq/
storePathCommitLog=/var/alibaba-rocketmq/commitlog
storePathConsumeQueue=/var/alibaba-rocketmq/consumequeue
storePathIndex=/var/alibaba-rocketmq/index
storeCheckpoint=/var/alibaba-rocketmq/checkpoint
abortFile=/var/alibaba-rocketmq/abort
brokerRole=SLAVE
flushDiskType=SYNC_FLUSH
checkTransactionMessageEnable=false
brokerIP1=172.16.150.136
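The six files above differ only in brokerName, brokerId, brokerRole, and brokerIP1, so they can be rendered from one template rather than edited by hand. The following is a sketch under that assumption: the function names `render_broker_conf` and `render_all` are hypothetical, and the heredoc carries only a subset of keys for brevity (the remaining keys from the listings above would be added to it unchanged).

```shell
#!/usr/bin/env bash
# Sketch: render the six broker property files from one shared template.
# Function names are hypothetical; only a subset of keys is shown.

# Print one broker config. Args: group letter (a|b|c), kind (primary|secondary), bind IP.
render_broker_conf() {
    local group="$1" kind="$2" ip="$3" id role
    if [ "$kind" = "primary" ]; then
        id=0; role=SYNC_MASTER
    else
        id=1; role=SLAVE
    fi
    # Master and Slave of one pair share brokerName; brokerId and brokerRole tell them apart
    cat <<EOF
brokerClusterName=prod_rocketmq-cluster
brokerName=broker-primary-${group}
brokerId=${id}
namesrvAddr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
listenPort=10911
storePathRootDir=/var/alibaba-rocketmq/
brokerRole=${role}
flushDiskType=SYNC_FLUSH
brokerIP1=${ip}
EOF
}

# Write all six files into the target directory, using the IP plan from section 2.1.
render_all() {
    local out="$1" n=1 g
    mkdir -p "$out"
    for g in a b c; do
        render_broker_conf "$g" primary   "172.16.150.13${n}"       > "$out/broker-primary_${g}.properties"
        render_broker_conf "$g" secondary "172.16.150.13$((n + 3))" > "$out/broker-secondary_${g}.properties"
        n=$((n + 1))
    done
}
```

Running `render_all /opt/alibaba-rocketmq/conf/3m-3s-sync` would produce files with the same names used in the startup commands later in this article.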
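3.5 Key Configuration Items
| Item | Description |
|---|---|
| brokerId | 0 for a Master, greater than 0 for a Slave |
| brokerRole | ASYNC_MASTER / SYNC_MASTER / SLAVE |
| brokerIP1 | IP address the Broker service binds to |
| slaveReadEnable | Recommended to enable; disabled by default |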
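Because brokerId and brokerRole must agree, a mismatched pair (for example brokerId=0 with brokerRole=SLAVE) is an easy copy-and-paste mistake. The sketch below checks a properties file for that consistency before the Broker is started. The helper names `prop` and `check_broker_conf` are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check that brokerId and brokerRole agree in a properties file.
# Helper names are hypothetical.

# Print the value of one key, tolerating spaces around "=" (last occurrence wins).
prop() {
    awk -F= -v key="$2" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); val = v }
        END { print val }' "$1"
}

# Return 0 if the file describes a consistent Master or Slave, non-zero otherwise.
check_broker_conf() {
    local file="$1" id role
    id=$(prop "$file" brokerId)
    role=$(prop "$file" brokerRole)
    case "$role" in
        SYNC_MASTER|ASYNC_MASTER) [ "$id" = "0" ] ;;
        SLAVE) [ -n "$id" ] && [ "$id" -gt 0 ] 2>/dev/null ;;
        *) return 1 ;;
    esac
}
```

A possible use is `check_broker_conf broker-secondary_a.properties || echo "brokerId/brokerRole mismatch"` on each node before starting its Broker.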
3.6 NameServer Configuration File
[root@rocketmq_master_01 conf]# cat namesrv.properties
listenPort = 9876
serverWorkerThreads = 8
serverCallbackExecutorThreads = 2
serverSelectorThreads = 3
serverOnewaySemaphoreValue = 256
serverAsyncSemaphoreValue = 64
serverChannelMaxIdleTimeSeconds = 120
serverSocketSndBufSize = 65535
serverSocketRcvBufSize = 65535
serverPooledByteBufAllocatorEnable = true
kvConfigPath=/var/alibaba-rocketmq/namesrv/
3.7 Changing Log and Data Paths
[root@rocketmq_master_01 3m-3s-sync]# cd /opt/alibaba-rocketmq/conf/
[root@rocketmq_master_01 conf]# sed -i 's#${user.home}#/var/alibaba-rocketmq#g' *.xml
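The sed command above uses # as the delimiter so that the slashes in the path need no escaping, and single quotes so that the shell passes ${user.home} through literally. A small wrapper that applies the same substitution and then confirms no placeholder is left could look like the sketch below. The function name `relocate_log_paths` is an assumption, and GNU sed (as shipped with CentOS) is assumed for the -i option.

```shell
#!/usr/bin/env bash
# Sketch: rewrite ${user.home} in the XML configs and verify the result.
# The function name is hypothetical; assumes GNU sed for in-place editing.
relocate_log_paths() {
    local conf_dir="$1" new_root="$2"
    # \$ keeps the shell from expanding the placeholder inside double quotes
    sed -i "s#\${user.home}#${new_root}#g" "$conf_dir"/*.xml
    # Succeed only if no ${user.home} placeholder remains in any XML file
    ! grep -q '\${user.home}' "$conf_dir"/*.xml
}
```

For this deployment the call would be `relocate_log_paths /opt/alibaba-rocketmq/conf /var/alibaba-rocketmq`.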
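3.8 Adjusting Startup Script Parameters
Edit the Broker startup script runbroker.sh:
Adjust the JVM heap settings. A suggested production setting is:
JAVA_OPT="${JAVA_OPT} -server -Xms4g -Xmx4g -Xmn2g"
Edit the NameServer startup script runserver.sh:
Adjust the JVM heap settings in the same way.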
3.9 Starting the Services
Step 1: Start the NameServer service
[root@rocketmq_master_01 bin]# cd /opt/alibaba-rocketmq/bin
[root@rocketmq_master_01 bin]# nohup sh mqnamesrv &
[root@rocketmq_master_01 bin]# tailf /var/alibaba-rocketmq/logs/rocketmqlogs/namesrv.log
Start all three NameServer nodes with the steps above.
Step 2: Start the Broker service
# Start Master A
[root@rocketmq_master_01 bin]# nohup sh mqbroker -c /opt/alibaba-rocketmq/conf/3m-3s-sync/broker-primary_a.properties &
[root@rocketmq_master_01 bin]# tailf /var/alibaba-rocketmq/logs/rocketmqlogs/broker.log
# Start Master B
[root@rocketmq_master_02 bin]# nohup sh mqbroker -c /opt/alibaba-rocketmq/conf/3m-3s-sync/broker-primary_b.properties &
# Start Master C
[root@rocketmq_master_03 bin]# nohup sh mqbroker -c /opt/alibaba-rocketmq/conf/3m-3s-sync/broker-primary_c.properties &
# Start Slave A
[root@rocketmq_slave_01 bin]# nohup sh mqbroker -c /opt/alibaba-rocketmq/conf/3m-3s-sync/broker-secondary_a.properties &
# Start Slave B
[root@rocketmq_slave_02 bin]# nohup sh mqbroker -c /opt/alibaba-rocketmq/conf/3m-3s-sync/broker-secondary_b.properties &
# Start Slave C
[root@rocketmq_slave_03 bin]# nohup sh mqbroker -c /opt/alibaba-rocketmq/conf/3m-3s-sync/broker-secondary_c.properties &
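Since the Brokers register with the NameServers on startup, it helps to confirm that a NameServer port is reachable before launching a Broker, and that the Broker port opens afterwards. The sketch below polls a TCP port using bash's /dev/tcp redirection. The function name `wait_for_port` is an assumption, and it relies on bash and the coreutils `timeout` command being installed.

```shell
#!/usr/bin/env bash
# Sketch: wait until host:port accepts TCP connections.
# The function name is hypothetical; requires bash and coreutils timeout.
# Args: host, port, maximum number of attempts (default 30, one per second).
wait_for_port() {
    local host="$1" port="$2" attempts="${3:-30}" tried=0
    # Each probe opens a connection in a child bash and gives up after 2 seconds
    while ! timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; do
        tried=$((tried + 1))
        [ "$tried" -ge "$attempts" ] && return 1
        sleep 1
    done
    return 0
}
```

For instance, `wait_for_port rocketmq_master_01 9876` could gate the mqbroker command, and `wait_for_port 172.16.150.131 10911` could confirm that Master A is listening. Once all six Brokers are up, `sh mqadmin clusterList -n rocketmq_master_01:9876` should list three broker names, each with a brokerId 0 and a brokerId 1 entry.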
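4. Deploying the Monitoring Console
rocketmq-console provides visual monitoring and management for the cluster. Because the RocketMQ version used here is old (3.2.2), a compatible console version is required (rocketmq-console-3.2.6).
4.1 Deployment Steps
# 1. Unpack the WAR file into the target directory
unzip rocketmq-console-3.2.6.war -d /opt/rocketmq-console/
# 2. Change to the configuration directory
cd /opt/rocketmq-console/WEB-INF/classes/
# 3. Set the NameServer addresses
# Edit config.properties and add the following line:
rocketmq.namesrv.addr=rocketmq_master_01:9876;rocketmq_master_02:9876;rocketmq_master_03:9876
# 4. Deploy to Tomcat and start it
# Move the directory to webapps/ROOT under Tomcat, then start Tomcat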
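4.2 Console Features
The web interface shows the cluster state at a glance, including:
- Running status and details of each Broker node
- Topic configuration and message statistics
- Consumption progress of each consumer group
- Message production and consumption rates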
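5. Operations Notes
1. Configuration file names must match node roles: a corresponds to Master 1, b to Master 2, and c to Master 3
2. brokerName must be identical within each Master-Slave pair and differ between pairs; a Slave is matched to its Master through this shared name
3. Each server must start its Broker with the correct configuration file path
4. In production, consider managing the service processes with screen rather than nohup
5. Enabling slaveReadEnable=true is recommended
6. Check disk usage regularly so that a full disk does not disrupt the service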
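The regular disk check in item 6 can be scripted. The sketch below reads the usage of the filesystem that holds the store directory and compares it with a limit that defaults to 88, the diskMaxUsedSpaceRatio value configured earlier. The function names `disk_used_percent` and `disk_below_limit` are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: compare disk usage of the store path with a percentage limit.
# Function names are hypothetical.

# Print the usage percentage (without "%") of the filesystem holding a path.
disk_used_percent() {
    # -P forces the single-line POSIX format, so the percentage is always field 5 of line 2
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 while usage stays below the limit (default 88).
disk_below_limit() {
    local path="$1" limit="${2:-88}" used
    used=$(disk_used_percent "$path")
    [ -n "$used" ] && [ "$used" -lt "$limit" ]
}
```

Run from cron, something like `disk_below_limit /var/alibaba-rocketmq || echo "RocketMQ store disk usage is high"` would give an early warning; how the warning is delivered is left to the local monitoring setup.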
6. References
- Apache RocketMQ documentation: https://rocketmq.apache.org/docs/quick-start/
- RocketMQ Console project: https://github.com/apache/rocketmq-externals/tree/release-rocketmq-console-1.0.0
- RocketMQ Chinese community documentation