Re: extremely slow writes to array [now not degraded]

Johannes Truschnigg <johannes@xxxxxxxxxxxxxxx> · Mon, 13 Nov 2023 10:20:46 +0100

Interesting data; thanks for providing it. Unfortunately, I am not familiar
with that part of kernel code at all, but there are two observations that I can
contribute:

According to kernel source, `ext4_mb_scan_aligned` is a "special case for
storages like raid5", where "we try to find stripe-aligned chunks for
stripe-size-multiple requests" - and it seems that on your system, it might be
trying a tad too hard. I don't have a kernel source tree handy right now to
take a look at what might have changed in the function and any of its
calle[er]s during recent times, but it's the first place I'd go take a closer
look at.

Also, there's a recent kernel Bugzilla entry[0] that observes similarly
pathological behavior from ext4 on a single disk of spinning rust, where that
particular function appears in the call stack. It revolves around an
mkfs-time-enabled feature which will, afaik, also be set if mke2fs(8) detects
md RAID in the storage stack beneath the device it is supposed to format (and
which SHOULD get set, esp. for parity-based RAID).

Chances are you may be able to disable this particular optimization by running
`tune2fs -E stride=0` against the filesystem's backing array (be warned that I
did NOT verify whether that might screw your data, which it very well could!!)
and remounting it afterwards, to check if that is indeed (part of) the
underlying cause of the poor performance you see. If you choose to try that,
make sure to record the current stride size first, so you can re-apply it at a
later time (`tune2fs -l` should show it).
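
For the "record the current value" step, something along these lines might
help. It is only a sketch: the function name `raid_geometry`, the device
`/dev/md0`, and the output file path are made up for illustration, and it
assumes `tune2fs -l` labels the fields "RAID stride" and "RAID stripe width",
as current e2fsprogs releases do. It also saves the stripe width alongside the
stride, on the assumption that both are worth having on record.

```shell
#!/bin/sh
# Filter `tune2fs -l` output (read on stdin) down to the RAID geometry
# fields, printed as "name=value" lines. Prints nothing if the filesystem
# has neither field recorded in its superblock.
raid_geometry() {
    awk -F': *' '/^RAID (stride|stripe width):/ { print $1 "=" $2 }'
}

# Usage (hypothetical device and file names; needs root):
#   tune2fs -l /dev/md0 | raid_geometry > /root/md0-raid-geometry.txt
```

The saved stride can later be restored with `tune2fs -E stride=<value>`, using
the number from that file.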

[0]: https://bugzilla.kernel.org/show_bug.cgi?id=217965

-- 
with best regards:
- Johannes Truschnigg ( johannes@xxxxxxxxxxxxxxx )

www:   https://johannes.truschnigg.info/
phone: +436502133337
xmpp:  johannes@xxxxxxxxxxxxxxx