> 7 disks raid6 array(*1). boot, root and swap on a separate
> SSD. [...] One disk was removed recently and sent for
> replacement. The system felt OK but a few days later I noticed
> an issue... I started copying the full dataset (260GB 8,911k
> files). [...] 4GB (system limit) dirty blocks were created,
> which took 12h to clear.

The MD RAID set is performing as designed: the achievable speed
is very low except on special workloads, but that low speed is
still good performance for a design that is close to optimal for
maximizing wait times on your workload. Maximizing storage wait
times is a common optimization in many IT places.

https://www.sabi.co.uk/blog/17-one.html?170610#170610
https://www.sabi.co.uk/blog/15-one.html?150305#150305

> [...] The copy completed fast but the kthread took about 1.5
> hours at 100%CPU to clear the dirty blocks.
> - When copying more files (3-5GB) the rsync was consuming
>   100%CPU and started pausing every few files (then killed).
> - A 'dd if=/dev/zero of=/data1/no-backup/old-backups/100GB'
>   completed quickly, no issues. [...]

[...]

> +  100.00%  1.60%  [kernel]  [k] ext4_mb_regular_allocator
> +   67.08%  5.93%  [kernel]  [k] ext4_mb_find_good_group_avg_frag_lists
> +   62.47% 42.34%  [kernel]  [k] ext4_mb_good_group
> +   22.51% 11.36%  [kernel]  [k] ext4_get_group_info
> +   19.70% 10.80%  [kernel]  [k] ext4_mb_scan_aligned

My guess is that the filesystem residing on that RAID set is
nearly full, heavily fragmented, or holds lots of small files,
and quite likely two out of three or even all three. Those are
also common optimizations used in many IT places to maximize
storage wait times.

https://www.techrepublic.com/forums/discussions/low-disk-performance-after-reching-75-of-disk-space/
https://www.spinics.net/lists/linux-ext4/msg26470.html
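On the RAID6 point above: with 7 disks only 5 carry data, so
only a write covering an entire stripe can be done from new data
alone; anything smaller goes through a read-modify-write cycle.
A minimal sketch of the arithmetic, assuming a 512 KiB chunk
size (an assumption; check the real value with 'mdadm --detail'):

```shell
# Hypothetical geometry for a 7-disk RAID6: 5 data disks + 2 parity (P and Q).
# The 512 KiB chunk size is an assumption; verify with 'mdadm --detail /dev/mdX'.
chunk_kib=512
data_disks=5
stripe_kib=$((chunk_kib * data_disks))
echo "full stripe = ${stripe_kib} KiB"
# Only writes of this size and alignment compute parity from new data alone.
# A small write costs up to 6 I/Os: read old data, P and Q, write all three back,
# which is why rsync over many small files crawls while a big sequential dd flies.
```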
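On the "4GB (system limit) dirty blocks" above: that limit comes
from the kernel's writeback thresholds. A sketch of where to
look, using the standard Linux sysctl paths; the tuning line at
the end is only an illustration, not a recommendation:

```shell
# Current dirty-page backlog and in-flight writeback:
grep -E '^(Dirty|Writeback):' /proc/meminfo
# The thresholds that let ~4 GB accumulate (ratios are % of reclaimable RAM;
# if the corresponding *_bytes sysctls are non-zero, they override the ratios):
cat /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio
# Example only: start background flushing after 256 MiB instead, trading some
# write coalescing for shorter multi-hour flush stalls:
#   sysctl -w vm.dirty_background_bytes=$((256 * 1024 * 1024))
```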
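To test the "nearly full and fragmented" guess above, e2fsprogs
can report both without touching the data; '/data1' and
'/dev/md0' below are placeholders for the actual mount point and
md device:

```shell
FS=/data1      # placeholder: the ext4 mount point on the RAID set
DEV=/dev/md0   # placeholder: the underlying md device
df -h "$FS"    # how close to full? ext4 allocation degrades badly when nearly full
# Histogram of free-extent sizes; fragmented free space makes mballoc scan
# many block groups, matching the perf profile above (run as root):
#   e2freefrag "$DEV"
# Read-only per-file fragmentation report:
#   e4defrag -c "$FS"
```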