一般情况下,MySQL/InnoDB都是运行在普通的PC Server + Linux(UNIX)上,虽然不期待它具有小型机+AIX 的高可用性,但想尽一切办法缩短MySQL的不可用时间仍然是DBA的目标。
根据经验,主机OS崩溃、硬件故障,仍然是影响MySQL可用性的最主要因素,如果这些故障都恢复了,另一个非常耗时的恢复就是InnoDB自己的恢复时间。
一般主机发生一次重启,正常大约需要小于5分钟的时间,但此时 InnoDB的恢复可能需要40分钟或者更久(这依赖于Buffer Pool、脏页面比例、TPS等因素)。试想,如果每次都能够把故障恢复时间控制在10分钟之内,那么通过应用容错、Cache支持等办法,用户体验和可用时间都将有进一步的提升。
在MySQL5.5版本里,通过算法和内存管理上的改进,做到了将故障恢复时间大大缩短,这也就意味着以后事务日志redo log最大可以设置为4 GB(xtraDB可以超过4 GB),这样就在很大程度上降低了I/O需求,刷脏页的频率降低了,从而极大地提高了InnoDB的写性能。
在MySQL5.5里,你无须任何操作即可实现快速恢复。下面,针对MySQL5.1.59 InnoDB和MySQL5.5 InnoDB Plugin 1.1.X,来一次破坏性实验,验证一下恢复时间。
实验前提:两台机器均为虚拟机,内存1 GB,两台机器的参数设置一致。如下所示:
innodb_log_file_size = 300M
innodb_log_files_in_group = 3
innodb_log_buffer_size = 16M
innodb_max_dirty_pages_pct = 75
innodb_force_recovery = 0
innodb_buffer_pool_size = 600M
innodb_flush_log_at_trx_commit = 0
下面开始测试。
首先来看看命令及参数,可以看到,其中有16个并发连接,最大请求1万个,表记录有9百万条。
Sysbench --test=oltp --MySQL-table-engine=innodb \
--oltp-table-size=9000000 \
--max-requests=10000 \
--num-threads=16 \
--MySQL-host=127.0.0.1 \
--MySQL-port=3306 \
--MySQL-user=root \
--MySQL-password=123456 \
--MySQL-db=test \
--MySQL-socket=/tmp/MySQL.sock prepare
接着另开一个终端,执行sleep 120;pkill -9 MySQLd,让其运行2分钟后,强杀MySQL进程,然后再启动MySQL,下面来看看恢复的过程:
MySQL5.1 恢复的过程:
120630 21:19:19 MySQLd_safe Starting MySQLd daemon with databases from /usr/local/MySQL/data
120630 21:19:19 [Note] Plugin 'FEDERATED' is disabled.
120630 21:19:19 InnoDB: Initializing buffer pool, size = 600.0M
120630 21:19:19 InnoDB: Completed initialization of buffer pool
InnoDB: Log scan progressed past the checkpoint lsn 0 337236631
120630 21:19:20 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 0 342479360
InnoDB: Doing recovery: scanned up to log sequence number 0 347722240
InnoDB: Doing recovery: scanned up to log sequence number 0 352965120
InnoDB: Doing recovery: scanned up to log sequence number 0 358208000
InnoDB: Doing recovery: scanned up to log sequence number 0 363450880
InnoDB: Doing recovery: scanned up to log sequence number 0 368693760
...
...
...
InnoDB: Doing recovery: scanned up to log sequence number 0 756666880
120630 21:20:03 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Doing recovery: scanned up to log sequence number 0 761909760
InnoDB: Doing recovery: scanned up to log sequence number 0 767152640
...
...
...
InnoDB: Doing recovery: scanned up to log sequence number 0 997839360
InnoDB: Doing recovery: scanned up to log sequence number 0 999899389
120630 21:21:56 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
120630 21:22:48 InnoDB: Started; log sequence number 0 999899389
120630 21:22:48 [Note] Event Scheduler: Loaded 0 events
120630 21:22:48 [Note] /usr/local/MySQL/bin/MySQLd: ready for connections.
Version: '5.1.59' socket: '/tmp/MySQL.sock' port: 3306 MySQL Community Server (GPL)
MySQL5.5 恢复的时间:
120630 21:45:39 MySQLd_safe Starting MySQLd daemon with databases from /usr/local/MySQL/data
120630 21:45:39 [Warning] You need to use --log-bin to make --binlog-format work.
120630 21:45:39 [Note] Plugin 'FEDERATED' is disabled.
120630 21:45:39 InnoDB: The InnoDB memory heap is disabled
120630 21:45:39 InnoDB: Mutexes and rw_locks use InnoDB's own implementation
120630 21:45:39 InnoDB: Compressed tables use zlib 1.2.3
120630 21:45:39 InnoDB: Using Linux native AIO
120630 21:45:39 InnoDB: Initializing buffer pool, size = 600.0M
120630 21:45:40 InnoDB: Completed initialization of buffer pool
120630 21:45:40 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 125985128
120630 21:45:41 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 131227648
InnoDB: Doing recovery: scanned up to log sequence number 136470528
InnoDB: Doing recovery: scanned up to log sequence number 141713408
InnoDB: Doing recovery: scanned up to log sequence number 146956288
...
...
...
InnoDB: Doing recovery: scanned up to log sequence number 540172288
InnoDB: Doing recovery: scanned up to log sequence number 545415168
120630 21:46:03 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Doing recovery: scanned up to log sequence number 550658048
InnoDB: Doing recovery: scanned up to log sequence number 555900928
...
...
...
InnoDB: Doing recovery: scanned up to log sequence number 755130368
InnoDB: Doing recovery: scanned up to log sequence number 755420180
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 7621 row operations to undo
InnoDB: Trx id counter is 600
120630 21:47:10 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Starting in background the rollback of uncommitted transactions
120630 21:47:21 InnoDB: Rolling back trx with id 424, 7621 rows to undo
InnoDB: Progress in percents: 1120630 21:47:21 InnoDB: Waiting for the background threads to start
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
InnoDB: Rolling back of trx id 424 completed
120630 21:47:22 InnoDB: Rollback of non-prepared transactions completed
120630 21:47:22 InnoDB: 1.1.8 started; log sequence number 755420180
120630 21:47:23 [Note] Event Scheduler: Loaded 0 events
120630 21:47:23 [Note] /usr/local/MySQL/bin/MySQLd: ready for connections.
Version: '5.5.19' socket: '/tmp/MySQL.sock' port: 3306 MySQL Community Server (GPL)
再来看看两个版本各自的恢复时间:
MySQL5.1 恢复时间 MySQL5.5 恢复时间
3分28秒 1分44秒
从上面的结果来看,MySQL5.5恢复的时间要比MySQL5.1快两倍多。当然,这个结果涉及的因素有很多,比如,Buffer Pool、脏页面比例、TPS等,实际操作时,需要根据自己的情况来测试。
关于加快InnoDB的恢复时间,请参见MySQL5.5手册: