MySQL 主备(Master-Slave Replication)数据库搭建和相关操作
创建集群
mysql -h$MYSQL_IP -u$MYSQL_USER -p"$MYSQL_PASS" -e "
GRANT REPLICATION SLAVE ON *.* to 'repl'@'$slave' IDENTIFIED BY '$MYSQL_PASS';
flush privileges;
# flush tables with read lock";
获取 master
节点File($binlog_file)
和 Position($bin_pos)
位置
备份主库
mysqldump -h$MYSQL_IP -u$MYSQL_USER -p"$MYSQL_PASS" --opt --all-databases > /tmp/db_1.sql
导入从库
mysql -h$MYSQL_IP1 -u$MYSQL_USER -p"$MYSQL_PASS" < /tmp/db_1.sql
创建集群:
mysql -h$MYSQL_IP1 -u$MYSQL_USER -p"$MYSQL_PASS" -e "
unlock tables;
CHANGE MASTER TO MASTER_HOST='$MYSQL_IP',
MASTER_PORT=3306,
MASTER_USER='repl',
MASTER_PASSWORD='$MYSQL_PASS',
MASTER_LOG_FILE='$binlog_file',
MASTER_LOG_POS=$bin_pos,
MASTER_CONNECT_RETRY=60;
start slave;"
MASTER_LOG_FILE
:指定 Slave 从哪个日志文件开始复制数据
MASTER_LOG_POS
:从哪个 Position 开始读
MASTER_CONNECT_RETRY
:如果连接失败,重试的时间间隔,单位是秒,默认是 60 秒
备节点文件说明:
master.info
记录了上一次读取到 master 同步过来的 binlog 的位置,以及连接 master 和启动复制必须的所有信息
relay-log.info
记录了文件复制的进度,下一个事件从什么位置开始,由 sql 线程负责更新
命令
查看主备状态
show master status\G;
show slave status\G;
show slave status
的参数说明:
Relay_Master_Log_File
:已经同步到主的哪个 binlog 文件
Exec_Master_Log_Pos
:已经同步到 binlog 文件的哪个位置
Slave_SQL_Running
:备节点的 SQL 进程,YES 表是正常
Slave_IO_Running
:备节点的 IO 进程,YES 表是正常
查看主节点 binlog 日志
show master logs;
检查主从同步
- master 节点
show master status\G;
Position 不应该为 0
- master 节点
show processlist;
STATE 状态应该为 Master has sent all binlog to slave; waiting for more updates
mysql> SELECT * FROM information_schema.processlist where user = "repl"\G;
*************************** 1. row ***************************
ID: 22
USER: repl
HOST: 172.20.20.21:31919
DB: NULL
COMMAND: Binlog Dump
TIME: 1791141
STATE: Master has sent all binlog to slave; waiting for more updates
INFO: NULL
1 row in set (0.00 sec)
- 确认如下二对参数值是否一致,如果不一致,说明主从库已经挂了,需要先同步数据:
mysql > show slave status\G;
Master_Log_File: Relay_Master_Log_File # binlog file
Read_Master_Log_Pos: Exec_Master_Log_Pos # 同步指针
- slave 节点 Slave_IO_Running 与 Slave_SQL_Running 状态都要为 Yes
mysql> show slave status\G;
*************************** 1. row ***************************
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
...
- slave 节点
show processlist;
,存在如下 2 行:
mysql> show processlist\G;
*************************** 1. row ***************************
Id: 1
User: system user
Host:
db: NULL
Command: Connect
Time: 50
State: Slave has read all relay log; waiting for more updates
Info: NULL
*************************** 2. row ***************************
Id: 2
User: system user
Host:
db: NULL
Command: Connect
Time: 1792327
State: Waiting for master to send event
Info: NULL
从库异常恢复
基于 xtrabackup 恢复使用位置点的从节点
- xtrabackup: 备份 MySQL 数据库
- 如果
xtrabackup_binlog_info
没有 GTID 信息,则代表备份实例没有开启 GTID,这个时候就无需设置 GTID_PURGED,直接执行 CHANGE MASTER TO 命令
CHANGE MASTER TO
MASTER_HOST='<master-ip>',
MASTER_USER='repl',
MASTER_PASSWORD='<password>',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000xxx',
MASTER_LOG_POS=xxx;
同步异常
如果从库的 Slave 未启动,Slave_IO_Running
为NO
。可能是主库是的master
的信息有变化,查看主库show master status;
,记录下File
、Position
字段,假设为mysql-bin.xxxx
、xxxx
,在从库执行:
mysql> stop slave;
mysql> change master to master_host='ip', master_user='username', master_password='password', master_log_file='mysql-bin.xxxx', master_log_pos=xxxx;
mysql> start slave;
Last_IO_Errno: 1032
stop slave;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
start slave;
Last_IO_Errno: 1593
如果从库的Slave_IO_Running
为NO
,Last_SQL_Error
为:
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).
在搭建主从复制的过程中,要确保两台机器的 server-id
是唯一的,server-id
的命名规则(服务器 ip 地址的最后一位+本 MySQL 服务的端口号),解决方法:
mysql> stop slave;
mysql> reset slave;
mysql> CHANGE MASTER TO MASTER_HOST='<master-ip>', MASTER_USER='<user>', MASTER_PASSWORD='<password>', master_log_file='mysql-bin.xxxx', master_log_pos=xxxx;
mysql> flush privileges;
mysql> start slave;
Last_SQL_Errno: 1062
如果从库的Slave_IO_Running
为NO
,Last_SQL_Error
为:
Error 'Duplicate entry '60314' for key 'PRIMARY'' on query. ...
可能是 master 未向 slave 同步成功,但 slave 中已经有了记录。造成的冲突可以在从库上执行:
mysql> stop slave;
mysql> set GLOBAL SQL_SLAVE_SKIP_COUNTER=1000;
mysql> start slave;
或
mysql -h$MYSQL_IP1 -u$MYSQL_USER -p"$MYSQL_PASS" -e "
stop slave;
set GLOBAL SQL_SLAVE_SKIP_COUNTER=1000;
start slave;"
SQL_SLAVE_SKIP_COUNTER
指的是数量,可以自行加大。跳过几步。再 restart slave 就可以了。
Last_Errno: 1677
Last_Errno: 1677
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log mysql-bin.xxx, end_log_pos xxx. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
- 解决方式:
ALL_NON_LOSSY(无损转换)
模式下,不会导致数据丢失和截断,因为该值只允许同类的小数据类型转换为大数据类型
stop slave;
set global slave_type_conversions=ALL_NON_LOSSY;
start slave;
在线开启 GTID
从基于 binlog 文件名及位点的方式,切换到基于 GTID(全局事务 ID) 的复制模式,相关参数:
- GTID_MODE 支持的值(参考)
OFF
值 0,只允许匿名事务被复制同步
OFF_PERMISSIVE
值 1,新产生的事务都是匿名事务,但也允许有 GTID 事务被复制同步
ON_PERMISSIVE
值 2,新产生的都是 GTID 事务,但也允许有匿名事务被复制同步
ON
值 3,只允许 GTID 事务被复制同步
# 在主从库均执行
SET @@GLOBAL.ENFORCE_GTID_CONSISTENCY = WARN;
# 在主从库均执行
SET @@GLOBAL.ENFORCE_GTID_CONSISTENCY = ON;
# 在主从库均执行
SET @@GLOBAL.GTID_MODE = OFF_PERMISSIVE;
# 在主从库均执行
SET @@GLOBAL.GTID_MODE = ON_PERMISSIVE;
# 确保该状态值输出的匿名事务数显示为0:仅在从库执行
SHOW STATUS LIKE 'ONGOING_ANONYMOUS_TRANSACTION_COUNT';
# or 查询视图中 LAST_SEEN_TRANSACTION 可以观测当前同步的事务是否还存在 ANONYMOUS 事务
select * from performance_schema.replication_applier_status_by_worker;
# 触发一轮日志切换,该操作仅在主库执行即可
FLUSH LOGS;
# 在主从库均执行
SET @@GLOBAL.GTID_MODE = ON;
# 查看均为 ON
SELECT @@GTID_MODE,@@ENFORCE_GTID_CONSISTENCY;
[mysqld]
gtid-mode = ON
enforce-gtid-consistency = ON
# 停止复制
STOP SLAVE;
# 修改为GTID模式
CHANGE MASTER TO MASTER_HOST='<master-ip>', MASTER_USER='<user>', MASTER_PASSWORD='<password>', MASTER_AUTO_POSITION = 1;
# 开启复制
START SLAVE;
# 观测复制同步状态
SHOW SLAVE STATUS\G
F&Q
事务级别会影响 MySQL binlog 的格式
In MySQL 5.7, when READ COMMITTED isolation level is used, or the deprecated innodb_locks_unsafe_for_binlog system variable is enabled,
there is no InnoDB gap locking except for foreign-key constraint checking and duplicate-key checking. Also, record locks for nonmatching
rows are released after MySQL has evaluated the WHERE condition.
If you use READ COMMITTED or enable innodb_locks_unsafe_for_binlog, you must use row-based binary logging.
主从 my.cnf 配置
[mysqld]
datadir=/var/lib/mysql/
socket=/var/lib/mysql//mysql.sock
symbolic-links=0
server_id=31 # 服务器ID,主从需要不一致
log-bin=mysql-bin # 二进制日志文件名
binlog_format = row # 强烈建议,其他格式可能造成数据不一致
log-slave-updates = 1 # 是否记录从服务器同步数据动作
gtid-mode = on # 启用gitd功能
enforce-gtid-consistency = 1 # 开启强制GTID一致性
master-info-repository = TABLE # 记录IO线程读取已经读取到的master binlog位置,用于slave宕机后IO线程根据文件中的POS点重新拉取binlog日志
relay-log-info-repository = TABLE # 记录SQL线程读取Master binlog的位置,用于Slave 宕机后根据文件中记录的pos点恢复Sql线程
sync-master-info = 1 # 启用确保无信息丢失;任何一个事务提交后, 将二进制日志的文件名及事件位置记录到文件中
slave-parallel-workers = 2 # 设定从服务器的复制线程数;0表示关闭多线程复制功能
binlog-checksum = CRC32 # 设置binlog校验算法(循环冗余校验码)
master-verify-checksum = 1 # 设置主服务器是否校验
slave-sql-verify-checksum = 1 # 设置从服务器是否校验
binlog-rows-query-log_events = 1 # 用于在二进制日志记录事件相关的信息,可降低故障排除的复杂度
sync_binlog = 1 # 保证master crash safe,该参数必须设置为1
innodb_flush_log_at_trx_commit = 1 # 保证master crash safe,该参数必须设置为1
innodb_file_per_table = 1
# read_only=on # 设置为只读模式,从节点配置
cat /etc/my.cnf
[client]
port=3306
socket=/data/run/mysql.sock
[mysqld]
port=3306
socket=/data/run/mysql.sock
pid_file=/data/run/mysql.pid
datadir=/data/data
default_storage_engine=InnoDB
max_allowed_packet=512M
max_connections=2048
open_files_limit=65535
skip-name-resolve
lower_case_table_names=1
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4'
innodb_buffer_pool_size=1024M
innodb_log_file_size=2048M
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=0
key_buffer_size=64M
log-error=/data/log/mysql_error.log
log-bin=/data/binlogs/mysql-bin
slow_query_log=1
slow_query_log_file=/data/log/mysql_slow_query.log
long_query_time=5
tmp_table_size=32M
max_heap_table_size=32M
query_cache_type=0
query_cache_size=0
server-id=1
总结
- 主键冲突、表已存在等错误代码如 1062、1032、1060 等,可以在 mysql 主配置文件指定略过此类异常并继续下条 sql 同步,这样也可以避免很多主从同步的异常中断
[mysqld]
slave-skip-errors = 1032,1060,1062