> 7 disks raid6 array(*1). boot, root and swap on a separate
> SSD. [...] One disk was removed recently and sent for
> replacement. The system felt OK but a few days later I noticed
> an issue... I started copying the full dataset (260GB 8,911k
> files). [...] 4GB (system limit) dirty blocks were created,
> which took 12h to clear.

The MD RAID set is performing as designed: the achievable speed
is very low except on special workloads, but that low speed is
still good performance for a design that is close to optimal for
maximizing wait times on your workload. Maximizing storage wait
times is a common optimization in many IT places.

https://www.sabi.co.uk/blog/17-one.html?170610#170610
https://www.sabi.co.uk/blog/15-one.html?150305#150305

> [...] The copy completed fast but the kthread took about 1.5
> hours at 100%CPU to clear the dirty blocks.
> - When copying more files (3-5GB) the rsync was consuming
>   100%CPU and started pausing every few files (then killed).
> - A 'dd if=/dev/zero of=/data1/no-backup/old-backups/100GB'
>   completed quickly, no issues. [...]

[...]

> +  100.00%  1.60%  [kernel]  [k] ext4_mb_regular_allocator
> +   67.08%  5.93%  [kernel]  [k] ext4_mb_find_good_group_avg_frag_lists
> +   62.47% 42.34%  [kernel]  [k] ext4_mb_good_group
> +   22.51% 11.36%  [kernel]  [k] ext4_get_group_info
> +   19.70% 10.80%  [kernel]  [k] ext4_mb_scan_aligned

My guess is that the filesystem residing on that RAID set is
nearly full, heavily fragmented, or holds lots of small files,
and quite likely two out of three or even all three. Those are
also common optimizations used in many IT places to maximize
storage wait times.

https://www.techrepublic.com/forums/discussions/low-disk-performance-after-reching-75-of-disk-space/
https://www.spinics.net/lists/linux-ext4/msg26470.html
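On the RAID6 point above: with 7 disks only 5 carry data, so
only a write covering an entire stripe can be done from new data
alone; anything smaller goes through a read-modify-write cycle.
A minimal sketch of the arithmetic, assuming a 512 KiB chunk
size (an assumption; check the real value with 'mdadm --detail'):

```shell
# Hypothetical geometry for a 7-disk RAID6: 5 data disks + 2 parity (P and Q).
# The 512 KiB chunk size is an assumption; verify with 'mdadm --detail /dev/mdX'.
chunk_kib=512
data_disks=5
stripe_kib=$((chunk_kib * data_disks))
echo "full stripe = ${stripe_kib} KiB"
# Only writes of this size and alignment compute parity from new data alone.
# A small write costs up to 6 I/Os: read old data, P and Q, write all three back,
# which is why rsync over many small files crawls while a big sequential dd flies.
```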
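On the "4GB (system limit) dirty blocks" above: that limit comes
from the kernel's writeback thresholds. A sketch of where to
look, using the standard Linux sysctl paths; the tuning line at
the end is only an illustration, not a recommendation:

```shell
# Current dirty-page backlog and in-flight writeback:
grep -E '^(Dirty|Writeback):' /proc/meminfo
# The thresholds that let ~4 GB accumulate (ratios are % of reclaimable RAM;
# if the corresponding *_bytes sysctls are non-zero, they override the ratios):
cat /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio
# Example only: start background flushing after 256 MiB instead, trading some
# write coalescing for shorter multi-hour flush stalls:
#   sysctl -w vm.dirty_background_bytes=$((256 * 1024 * 1024))
```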
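To test the "nearly full and fragmented" guess above, e2fsprogs
can report both without touching the data; '/data1' and
'/dev/md0' below are placeholders for the actual mount point and
md device:

```shell
FS=/data1      # placeholder: the ext4 mount point on the RAID set
DEV=/dev/md0   # placeholder: the underlying md device
df -h "$FS"    # how close to full? ext4 allocation degrades badly when nearly full
# Histogram of free-extent sizes; fragmented free space makes mballoc scan
# many block groups, matching the perf profile above (run as root):
#   e2freefrag "$DEV"
# Read-only per-file fragmentation report:
#   e4defrag -c "$FS"
```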