Hi,

I am trying to use NILFS2 to make cold backups of MySQL. I run many MySQL slave servers on this machine and store their data on a NILFS2 filesystem. Most tables use the MyISAM engine, only a few use InnoDB. When the issue occurred, several MySQL instances were running and I was copying data into a MySQL data directory with rsync; the %util of the NILFS2 partition was almost 100%. (A rough sketch of the backup flow follows the iotop snapshot below.)

Besides that problem, I also ran into another one: some MyISAM tables are suddenly marked as crashed, and then the slave's sql_thread stops. I do not need to repair the table at all; if I just wait a bit and restart the sql_thread, replication continues. So I suspect this may also be a filesystem problem. Below is the MySQL error log for this:

121225 14:38:03 [ERROR] Slave SQL: Error 'Table 'consume_log_2a' is marked as crashed and should be repaired' on query. Default database: 'app_wsgrr'. Query: 'DELETE FROM consume_log_2a WHERE log_time<=1353750526 AND coin_type !=2 AND coin_type!=12 AND coin_type !=13', Error_code: 1194   # table crashed
121225 14:38:03 [Warning] Slave: Table 'consume_log_2a' is marked as crashed and should be repaired Error_code: 1194
121225 14:38:03 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'yf75-bin.000296' position 680291890
121225 14:58:54 [Note] Slave I/O thread exiting, read up to log 'yf75-bin.000298', position 802588261
# restart the sql_thread without repairing the table
121225 14:58:55 [Note] Slave I/O thread: connected to master 'replica@10.75.7.75:6011',replication started in log 'yf75-bin.000298' at position 802588261
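For reference, this is roughly how I bring a stopped slave back without repairing the table; the socket path below is just an example for one instance, not my exact script:

  # see why the SQL thread stopped (socket path is an example)
  mysql -S /data0/mysql6013/mysql.sock -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_SQL_Running|Last_SQL_Error'

  # wait a bit, then restart only the SQL thread; no REPAIR TABLE or myisamchk is run first
  mysql -S /data0/mysql6013/mysql.sock -e "START SLAVE SQL_THREAD;"

After that, replication continues without errors.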
Version of MySQL Server: Percona Server 5.5.23

sdb2 is the NILFS2 partition. Below is the result of "iostat -xm -p sdb 1" over the last few seconds:

Linux 2.6.32-220.13.1.el6.x86_64 (yf237)  12/25/2012  _x86_64_  (8 CPU)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           1.24   0.00     1.91     4.02    0.00  92.83

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb      3285.75   42.27  259.02  151.13   15.14   18.46    167.73      5.54  13.52   0.81  33.26
sdb1        0.01   42.26    0.37    1.98    0.04    0.17    183.98      0.06  26.61   2.38   0.56
sdb2     3285.74    0.00  258.65  148.65   15.10   18.28    167.85      5.48  13.46   0.82  33.25

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           1.23   0.00     2.22    16.54    0.00  80.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       251.00    0.00  736.00   81.00    4.62    8.87     33.81     19.26  23.74   1.22  99.90
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2      251.00    0.00  736.00   79.00    4.62    8.87     33.89     19.27  23.79   1.23  99.90

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.80   0.00     1.90    20.99    0.00  72.31

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       300.00  450.00  900.00   87.00   10.64    9.50     41.80     19.50  19.55   1.01  99.90
sdb1        0.00  450.00    1.00   16.00    0.12    1.82    234.35      0.32  18.88   6.76  11.50
sdb2      300.00    0.00  899.00   69.00   10.52    7.68     38.50     19.18  19.60   1.03  99.90

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           2.12   0.00     2.25    21.60    0.00  74.03

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       449.00    0.00  625.00   81.00    5.00    9.18     41.12     15.32  21.36   1.41  99.40
sdb1        0.00    0.00    1.00    0.00    0.12    0.00    256.00      0.02  19.00  19.00   1.90
sdb2      449.00    0.00  624.00   79.00    4.88    9.18     40.93     15.30  21.42   1.41  99.40

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.51   0.00     1.52    15.08    0.00  82.89

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb      9243.00    0.00  720.00   83.00   23.93    8.20     81.94     19.40  21.53   1.25 100.00
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2     9341.00    0.00  722.00   79.00   24.18    8.20     82.79     19.39  21.65   1.25  99.90

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           1.23   0.00     6.67    11.36    0.00  80.74

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb     10607.00   25.00  687.00  718.00   59.84   86.66    213.54     26.66  20.83   0.66  92.20
sdb1        0.00   24.00    0.00    9.00    0.00    0.13     29.33      0.00   0.11   0.11   0.10
sdb2    10509.00    1.00  685.00  705.00   59.59   86.53    215.29     26.66  21.02   0.66  92.30

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.86   0.00     1.84    16.67    0.00  80.64

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       127.00    0.00  458.00  139.00    3.05   13.29     56.07      9.15  15.29   1.66  99.30
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2      127.00    0.00  458.00  137.00    3.05   13.29     56.26      9.15  15.35   1.67  99.20

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.50   0.00     1.38    13.05    0.00  85.07

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       111.00  229.00  523.00  151.00    3.28   10.09     40.61     12.50  18.36   1.47  99.00
sdb1        0.00  229.00    2.00   54.00    0.25    1.11     49.57      0.33   5.82   1.14   6.40
sdb2      111.00    0.00  521.00   89.00    3.03    8.98     40.31     12.17  19.75   1.62  99.00

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.88   0.00     1.26    14.99    0.00  82.87

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       171.00    0.00  539.00   86.00    3.29    7.48     35.28     10.23  16.33   1.58  98.80
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2      171.00    0.00  540.00   78.00    3.29    7.48     35.69     10.23  16.55   1.60  98.80

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.76   0.00     1.90    17.62    0.00  79.72

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       156.00    0.00  536.00  119.00    3.31   11.17     45.26     12.63  18.86   1.51  99.20
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2      156.00    0.00  535.00  114.00    3.30   11.17     45.67     12.63  19.00   1.53  99.20

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.49   0.00     1.10    14.81    0.00  83.60

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       169.00   23.00  552.00   94.00    3.36    9.02     39.24     15.50  24.51   1.54  99.80
sdb1        0.00   23.00    0.00    6.00    0.00    0.11     38.67      0.00   0.00   0.00   0.00
sdb2      169.00    0.00  552.00   82.00    3.36    8.90     39.62     15.50  24.97   1.57  99.80

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.75   0.00     1.25    14.91    0.00  83.08

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       140.00    0.00  589.00  104.00    3.61    9.41     38.50     12.04  17.35   1.43  99.20
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2      140.00    0.00  589.00  103.00    3.61    9.41     38.55     12.04  17.38   1.43  99.20

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.63   0.00     1.14    14.68    0.00  83.54

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       174.00  323.00  599.00  112.00    3.97   10.52     41.73     12.44  17.52   1.39  98.60
sdb1        0.00  323.00    0.00   11.00    0.00    1.30    242.91      0.11   9.82   1.00   1.10
sdb2      174.00    0.00  601.00   93.00    3.98    9.21     38.93     12.33  17.84   1.42  98.60

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.49   0.00     1.11    15.23    0.00  83.17

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        95.00    0.00  750.00   65.00    3.91    6.57     26.34     20.28  24.46   1.22  99.50
sdb1        0.00    0.00    1.00    0.00    0.12    0.00    256.00      0.07  71.00  71.00   7.10
sdb2       95.00    0.00  747.00   60.00    3.78    6.57     26.26     20.20  24.56   1.23  99.50

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.50   0.00     1.37    13.04    0.00  85.09

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       159.00    0.00  557.00   96.00    3.23    8.82     37.81     16.73  26.08   1.52  99.40
sdb1        0.00    0.00    0.00    0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sdb2      159.00    0.00  557.00   92.00    3.23    8.82     38.04     16.73  26.24   1.53  99.40

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.51   0.00     1.16    14.51    0.00  83.83

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       139.00    0.00  637.00   79.00    3.73    7.44     31.96     14.28  20.11   1.39  99.80
sdb1        0.00    0.00    2.00    0.00    0.25    0.00    256.00      0.04  22.00  22.00   4.40
sdb2      139.00    0.00  635.00   67.00    3.48    7.44     31.86     14.24  20.45   1.42  99.80

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.98   0.00     1.60    15.83    0.00  81.60

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       198.00   15.00  566.00  109.00    3.79   10.37     42.96     14.10  20.60   1.48  99.60
sdb1        0.00   15.00    1.00    6.00    0.12    0.08     60.57      0.02   2.71   2.71   1.90
sdb2      198.00    0.00  566.00   97.00    3.67   10.29     43.11     14.08  20.97   1.50  99.60

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           1.11   0.00     1.35    17.32    0.00  80.22

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb       168.00  257.00  613.00   99.00    3.60    9.41     37.43     15.08  21.11   1.39  99.10
sdb1        0.00  257.00    0.00   10.00    0.00    1.04    213.60      0.26  26.00   2.60   2.60
sdb2      168.00    0.00  612.00   86.00    3.59    8.37     35.11     14.82  21.13   1.42  99.10

Below is a snapshot of iotop:

Total DISK READ: 3.26 M/s | Total DISK WRITE: 66.36 M/s
  TID  PRIO  USER      DISK READ    DISK WRITE  SWAPIN      IO>  COMMAND
32291  be/4  my6013  1854.71 K/s    77.93 K/s   0.00 %  84.47 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
32038  be/4  my6005   472.77 K/s   233.79 K/s   0.00 %  38.31 %  mysqld --defaults-file=/data0/mysql6005/my6005.cnf
27470  be/4  my6015   316.91 K/s     0.00 B/s   0.00 %  26.11 %  mysqld --defaults-file=/data0/mysql6015/my6015.cnf
14478  be/4  my6010   124.69 K/s   223.40 K/s   0.00 %  19.95 %  mysqld --defaults-file=/data0/mysql6010/my6010.cnf
32131  be/4  my6007   363.67 K/s   264.96 K/s   0.00 %  16.28 %  mysqld --defaults-file=/data0/mysql6007/my6007.cnf
11578  be/4  my6018    31.17 K/s   353.28 K/s   0.00 %  14.17 %  mysqld --defaults-file=/data0/mysql6018/my6018.cnf
27469  be/4  my6015     5.20 K/s    15.59 K/s   0.00 %  12.47 %  mysqld --defaults-file=/data0/mysql6015/my6015.cnf
25104  be/4  my6009    15.59 K/s   161.05 K/s   0.00 %   9.47 %  mysqld --defaults-file=/data0/mysql6009/my6009.cnf
 7144  be/4  root      41.56 K/s     5.82 M/s   0.00 %   8.41 %  [segctord]
11498  be/4  my6018    46.76 K/s    67.54 K/s   0.00 %   7.50 %  mysqld --defaults-file=/data0/mysql6018/my6018.cnf
 1307  be/4  my6016     5.20 K/s   140.27 K/s   0.00 %   4.83 %  mysqld --defaults-file=/data0/mysql6016/my6016.cnf
11481  be/4  my6018    20.78 K/s     0.00 B/s   0.00 %   3.89 %  mysqld --defaults-file=/data0/mysql6018/my6018.cnf
13831  be/4  my6003     5.20 K/s   181.83 K/s   0.00 %   0.77 %  mysqld --defaults-file=/data0/mysql6003/my6003.cnf
  973  be/4  root       0.00 B/s    46.76 K/s   0.00 %   0.03 %  [kjournald]
  972  be/4  root       0.00 B/s     0.00 B/s   0.00 %   0.02 %  [kjournald]
18568  be/4  my6016     0.00 B/s    93.51 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6016/my6016.cnf
18569  be/4  my6016     0.00 B/s   207.81 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6016/my6016.cnf
14477  be/4  my6010     0.00 B/s   109.10 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6010/my6010.cnf
32130  be/4  my6007     0.00 B/s    51.95 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6007/my6007.cnf
12656  be/4  www        0.00 B/s  1449.48 K/s   0.00 %   0.00 %  rsync --daemon
25103  be/4  my6009     0.00 B/s    31.17 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6009/my6009.cnf
  962  be/4  my6013     0.00 B/s   353.28 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
  963  be/4  my6013     0.00 B/s   327.30 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
  964  be/4  my6013     0.00 B/s   135.08 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
  965  be/4  my6013     0.00 B/s   290.94 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
 7145  be/4  root       0.00 B/s   446.79 K/s   0.00 %   0.00 %  nilfs_cleanerd -n /dev/sdb2 /data0/
13830  be/4  my6003     0.00 B/s    62.34 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6003/my6003.cnf
27723  be/4  my6015     0.00 B/s   244.18 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6015/my6015.cnf
27722  be/4  my6015     0.00 B/s   150.66 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6015/my6015.cnf
11577  be/4  my6018     0.00 B/s   124.69 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6018/my6018.cnf
32193  be/4  my6011     0.00 B/s    98.71 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6011/my6011.cnf
32240  be/4  my6012     0.00 B/s    93.51 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6012/my6012.cnf
11803  be/4  my6002     0.00 B/s    15.59 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6002/my6002.cnf
11804  be/4  my6002     0.00 B/s    25.98 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6002/my6002.cnf
32290  be/4  my6013     0.00 B/s   140.27 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
32352  be/4  my6014     0.00 B/s    25.98 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6014/my6014.cnf
32037  be/4  my6005     0.00 B/s   150.66 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6005/my6005.cnf
  984  be/4  my6013     0.00 B/s  1329.99 K/s   0.00 %   0.00 %  mysqld --defaults-file=/data0/mysql6013/my6013.cnf
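To describe the workload in a bit more detail (for the question quoted below): the slaves keep replicating on the NILFS2 partition while another host pushes cold-backup copies of MySQL data into /data0 through the rsync daemon shown above. The sketch below only illustrates the flow; the source path, the rsync module name and the snapshot step are examples of how NILFS2 could be used here, not my exact scripts:

  # on the source host: push one instance's data directory to this box (module name is an example)
  rsync -a /data/mysql6013/ yf237::mysql_backup/mysql6013/

  # on this box (yf237), once a copy has finished, its state could be frozen with a NILFS2 snapshot
  mkcp -s /dev/sdb2      # create a checkpoint and mark it as a snapshot
  lscp -s /dev/sdb2      # list snapshots to verify

While the copy runs, segctord and nilfs_cleanerd (both visible in the iotop output) are also writing to sdb2.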
On 2012-12-25, at 14:02, Vyacheslav Dubeyko <slava@xxxxxxxxxxx> wrote:

> Hi,
>
> On Thu, 2012-12-20 at 19:02 +0800, 张 磊 wrote:
>> Yes, I mounted NILFS2 as read-write. It was remounted read-only by the kernel when the filesystem found the bad btree node.
>>
>> That's the full backtrace. I will keep on testing, and report more information once I find something.
>>
>
> I am trying to reproduce the issue, but currently without any success. I
> have a presupposition that it can be a synchronization issue between the GC
> and the main driver logic, but I don't have any evidence of it yet. Probably, I
> can't reproduce some of the environment's peculiarities.
>
> So, I think that I need to understand more deeply the workload in which the
> issue occurred. As I remember, you talked about several MySQL
> databases and so on. Could you describe in more detail what applications
> were running and how they were working before the issue occurred?
>
> Thanks,
> Vyacheslav Dubeyko.
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html