Hi all,

It doesn't seem like my response from last night made it to the list, so here it is again:

Hi Shaohua and all,

I was finally able to upgrade my Ubuntu server to a newer version of the kernel and mdadm:
==========================
$ uname -r; mdadm -V
4.8.0-rc7-custom
mdadm - v3.4 - 28th January 2016
==========================

I rebuilt the kernel with the change that Shaohua asked me to build in:
======================================
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5883ef0..db484ca 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -62,6 +62,9 @@
 #include "raid0.h"
 #include "bitmap.h"
 
+#undef pr_debug
+#define pr_debug trace_printk
+
 #define cpu_to_group(cpu) cpu_to_node(cpu)
 #define ANY_GROUP NUMA_NO_NODE
======================================

Here's how things look so far; nothing different yet:
======================================
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdd[10](S) sdk[0](S) sdj[2](S) sdh[3](S) sde[11](S) sdg[9](S) sdf[7](S) sdb[13](S) sdc[12](S)
      26371219608 blocks super 1.2

unused devices: <none>
======================================

Here's an event-count snapshot of my array. Keep in mind that /dev/sdi is my failed drive, so I omitted it from the examination:
======================================
# mdadm -E /dev/sd[b-h,j,k] | grep Events
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280011
         Events : 280033
         Events : 280033
======================================

It's important to note that, since I haven't done anything yet, the CPU is idle:
======================================
top - 20:22:00 up  5:56,  2 users,  load average: 0.04, 0.03, 0.00
Tasks: 221 total,   1 running, 220 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.0 us,  1.0 sy,  0.0 ni, 97.5 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1525400 total,   103836 free,   696208 used,   725356 buff/cache
KiB Swap: 25153532 total, 25117380 free,    36152 used.   454808 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 2093 cisco     20   0 1761112 153108  55640 S  1.0 10.0   0:12.61 gnome-shell
 4322 root      20   0   40520   3684   3100 R  1.0  0.2   0:00.22 top
    1 root      20   0  119692   5540   3992 S  0.0  0.4   0:02.44 systemd
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.09 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:
======================================
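In case anyone wants to follow along: with the patch above, the raid5 pr_debug() messages go through trace_printk() and land in the ftrace ring buffer rather than in dmesg. Here's a rough sketch of how I'm clearing and capturing that buffer around each step (this assumes debugfs is mounted at /sys/kernel/debug; the buffer size and the output filename are just arbitrary choices on my part):
======================================
# Clear the ftrace ring buffer before the next step
echo > /sys/kernel/debug/tracing/trace

# Optionally enlarge the per-CPU buffer so fewer entries get overwritten
# (65536 KB per CPU is an arbitrary pick)
echo 65536 > /sys/kernel/debug/tracing/buffer_size_kb

# ... stop / reassemble the array here ...

# Snapshot whatever is currently in the buffer
cat /sys/kernel/debug/tracing/trace > /tmp/md127-trace.txt
======================================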
Now onto the fun part. I stopped /dev/md127:
======================================
# mdadm --stop /dev/md127
mdadm: stopped /dev/md127

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
======================================

For completeness, here's the trace output after stopping the array and before reassembling:
======================================
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0   #P:2
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |

/sys/kernel/debug/tracing/trace (END)
======================================

Then I reassembled the array:
======================================
# mdadm -Afv /dev/md127 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdj /dev/sdk
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdb is busy - skipping
mdadm: Merging with already-assembled /dev/md/en1
mdadm: /dev/sdb is identified as a member of /dev/md/en1, slot 7.
mdadm: /dev/sdc is identified as a member of /dev/md/en1, slot 8.
mdadm: /dev/sdd is identified as a member of /dev/md/en1, slot 6.
mdadm: /dev/sde is identified as a member of /dev/md/en1, slot 9.
mdadm: /dev/sdf is identified as a member of /dev/md/en1, slot 4.
mdadm: /dev/sdg is identified as a member of /dev/md/en1, slot 1.
mdadm: /dev/sdh is identified as a member of /dev/md/en1, slot 3.
mdadm: /dev/sdj is identified as a member of /dev/md/en1, slot 2.
mdadm: /dev/sdk is identified as a member of /dev/md/en1, slot 0.
mdadm: Marking array /dev/md/en1 as 'clean'
mdadm: /dev/md/en1 has an active reshape - checking if critical section needs to be restored
mdadm: No backup metadata on device-7
mdadm: No backup metadata on device-8
mdadm: No backup metadata on device-9
mdadm: added /dev/sdg to /dev/md/en1 as 1
mdadm: added /dev/sdj to /dev/md/en1 as 2
mdadm: added /dev/sdh to /dev/md/en1 as 3 (possibly out of date)
mdadm: added /dev/sdf to /dev/md/en1 as 4
mdadm: no uptodate device for slot 5 of /dev/md/en1
mdadm: added /dev/sdd to /dev/md/en1 as 6
mdadm: /dev/sdb is already in /dev/md/en1 as 7
mdadm: added /dev/sdc to /dev/md/en1 as 8
mdadm: added /dev/sde to /dev/md/en1 as 9
mdadm: added /dev/sdk to /dev/md/en1 as 0
======================================

And of course, the CPU shoots to 100%:
======================================
top - 20:38:44 up  6:13,  3 users,  load average: 5.05, 3.25, 1.41
Tasks: 239 total,   3 running, 236 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.9 us, 52.7 sy,  0.0 ni,  0.0 id, 41.4 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1525400 total,    73124 free,   739576 used,   712700 buff/cache
KiB Swap: 25153532 total, 25111140 free,    42392 used.   415840 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 6423 root      20   0       0      0      0 R 99.0  0.0   4:43.51 md127_raid6
 1166 root      20   0  280588   8780   6192 S  3.0  0.6   0:06.56 polkitd
 4022 cisco     20   0  394756  32884  26064 S  3.0  2.2   0:08.77 gnome-disks
 1903 cisco     20   0  256660  34060  26280 S  2.0  2.2   0:29.56 Xorg
 2093 cisco     20   0 1760364 153572  55572 S  2.0 10.1   0:17.96 gnome-shell
======================================
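If it would help to see where the md127_raid6 thread is spinning, I can also sample its kernel stack while it's pegged. A quick sketch of what I'd run (6423 is the PID from the top output above; this assumes the kernel was built with CONFIG_STACKTRACE so /proc/<pid>/stack is available):
======================================
# Grab the raid6 thread's kernel stack a few times, one second apart
for i in 1 2 3 4 5; do
    cat /proc/6423/stack
    echo ----
    sleep 1
done
======================================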
Then, sure enough, the reshape speed drops right back down to nothing:
======================================
# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid6 sdk[0] sde[11] sdc[12] sdd[10] sdf[7] sdj[2] sdg[9] sdb[13]
      14650675200 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/8] [UUU_U_UUUU]
      [=======>.............]  reshape = 39.1% (1146348512/2930135040) finish=22057575.1min speed=1K/sec
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>
======================================

The trace file is gigantic, so hopefully this excerpt doesn't get trimmed in the email. Any help would be appreciated; thanks in advance:
======================================
# tracer: nop
#
# entries-in-buffer/entries-written: 44739/81554230   #P:2
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
     md127_raid6-6423  [001] .... 22228.159918: analyse_stripe: check 7: state 0x13 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159919: analyse_stripe: check 6: state 0xa01 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159919: analyse_stripe: check 5: state 0x801 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159920: analyse_stripe: check 4: state 0x811 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159921: analyse_stripe: check 3: state 0x801 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159921: analyse_stripe: check 2: state 0x811 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159922: analyse_stripe: check 1: state 0xa01 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159923: analyse_stripe: check 0: state 0x811 read (null) write (null) written (null)
     md127_raid6-6423  [001] .... 22228.159924: handle_stripe: locked=2 uptodate=10 to_read=0 to_write=0 failed=4 failed_num=6,5
     md127_raid6-6423  [001] .... 22228.159925: schedule_reconstruction: schedule_reconstruction: stripe 2292697672 locked: 4 ops_request: 10
     md127_raid6-6423  [001] .... 22228.159925: raid_run_ops: ops_run_reconstruct6: stripe 2292697672
     md127_raid6-6423  [001] .... 22228.159943: ops_complete_reconstruct: ops_complete_reconstruct: stripe 2292697672
     md127_raid6-6423  [001] .... 22228.159944: handle_stripe: handling stripe 2292697680, state=0x1401 cnt=1, pd_idx=7, qd_idx=8 , check:0, reconstruct:6
======================================

I trimmed the trace because of the failure-to-send issue, but if someone needs a longer snippet of it, let me know.

Thanks,
'Cisco