One thing to correct: the hang is not forever. After I posted the previous
email, all commands returned and the array stopped. It took around 40
minutes, which is still quite unexpected and suspicious.

Thanks.

Coly Li

On 2020/9/12 22:06, Coly Li wrote:
> Unexpected Behavior:
> - With Linux v5.9-rc4 mainline kernel and latest mdadm upstream code
> - After running fio with 10 jobs, iodepth 16 and 64K block size for a
>   while, try to stop the fio process by 'Ctrl + c', the main fio process
>   hangs.
> - Then try to stop the md raid5 array by 'mdadm -S /dev/md0', the mdadm
>   process hangs.
> - Reboot the system by 'echo b > /proc/sysrq-trigger', this md raid5
>   array is assembled but inactive. /proc/mdstat shows,
>   Personalities : [raid6] [raid5] [raid4]
>   md127 : inactive sdc[0] sde[3] sdd[1]
>         35156259840 blocks super 1.2
>
> Expectation:
> - The fio process can be stopped with 'Ctrl + c'
> - The raid5 array can be stopped by 'mdadm -S /dev/md0'
> - This md raid5 array continues to work (resyncing and active)
>   after reboot
>
>
> How to reproduce:
> 1) Create md raid5 with 3 hard drives (12TB for each SATA spinning disk)
>   # mdadm -C /dev/md0 -l 5 -n 3 /dev/sd{c,d,e}
>   # cat /proc/mdstat
>   Personalities : [raid6] [raid5] [raid4]
>   md0 : active raid5 sde[3] sdd[1] sdc[0]
>         23437506560 blocks super 1.2 level 5, 512k chunk, algorithm 2
>   [3/2] [UU_]
>         [>....................]  recovery = 0.0% (2556792/11718753280)
>   finish=5765844.7min speed=33K/sec
>         bitmap: 2/88 pages [8KB], 65536KB chunk
>
> 2) Run fio for random write on the raid5 array
>   fio job file content:
>   [global]
>   thread=1
>   ioengine=libaio
>   random_generator=tausworthe64
>
>   [job]
>   filename=/dev/md0
>   readwrite=randwrite
>   blocksize=64K
>   numjobs=10
>   iodepth=16
>   runtime=1m
>   # fio ./raid5.fio
>
> 3) Wait for 10 seconds after the above fio runs, then type 'Ctrl + c' to
>   stop the fio process:
>   x:/home/colyli/fio_test/raid5 # fio ./raid5.fio
>   job: (g=0): rw=randwrite, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB,
>   (T) 64.0KiB-64.0KiB, ioengine=libaio, iodepth=16
>   ...
>   fio-3.23-10-ge007
>   Starting 12 threads
>   ^Cbs: 12 (f=12): [w(12)][3.3%][w=6080KiB/s][w=95 IOPS][eta 14m:30s]
>   fio: terminating on signal 2
>   ^C
>   fio: terminating on signal 2
>   ^C
>   fio: terminating on signal 2
>   Jobs: 11 (f=11): [w(5),_(1),w(4),f(1),w(1)][7.5%][eta 14m:20s]
>   ^C
>   fio: terminating on signal 2
>   Jobs: 11 (f=11): [w(5),_(1),w(4),f(1),w(1)][70.5%][eta 15m:00s]
>
>   Now the fio process hangs forever.
>
> 4) Try to stop this md raid5 array by mdadm
>   # mdadm -S /dev/md0
>   Now the mdadm process hangs forever.
>
>
> Kernel versions to reproduce
> - Use latest upstream mdadm source code
> - I tried Linux v5.9-rc4 and Linux v4.12; both of them reproduce the
>   above unexpected behavior reliably.
>   Therefore I assume kernels at least from v4.12 to v5.9 may have this
>   issue.
>
> Just for your information, hope you may have a look into it. Thanks in
> advance.
>
> Coly Li
>
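
For anyone re-running the steps above, here is a minimal shell sketch for spotting the "assembled but inactive" state after the reboot. The function name inactive_md_arrays is my own and hypothetical, not part of mdadm. It assumes only the "mdX : inactive ..." line format shown in the /proc/mdstat output quoted above, and takes an optional file argument so it can be tried on a saved copy of mdstat.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm): print the name of every md
# array that the given mdstat file reports as inactive.
# Reads the file named by $1, defaulting to /proc/mdstat.
inactive_md_arrays() {
    # Array lines look like "md127 : inactive sdc[0] sde[3] sdd[1]":
    # field 1 is the array name, field 2 is ":", field 3 is the state.
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}
```

After sourcing the file, calling inactive_md_arrays with no argument reads the live /proc/mdstat. On the system described above I would expect it to list md127, and to print nothing once the array is active again.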