On Fri, Jan 25, 2019 at 10:10 AM Michal Soltys <soltys@xxxxxxxx> wrote:
>
> I've been testing a simple raid5 consisting of 4 ssds (all handling
> post-trim reads as 0s safely, as reported by hdparm -I: "Deterministic
> read ZEROs after TRIM") - with 4.19.13 and 4.20.0 as of now (so decently
> fresh kernels).
>
> The drives themselves are 2T (1.9T after overprovisioning with hpa), so
> initial tests were done with a 5.7T array, but that was completely
> unusable in practice the moment discards were attempted.
>
> Then I created a tiny array from 32GB partitions (pre-discarded manually):
>
> # echo Y >/sys/module/raid456/parameters/devices_handle_discard_safely
> # mdadm -C /dev/md/test32 -e1.1 -l5 -n4 -z32G --assume-clean -N test32 \
>     /dev/sd{a,b,c,d}1
>
> Then I did a simple mkfs.ext4 /dev/md/test32
>
> It took over 11 minutes to finish (of which nearly everything was
> accounted for by the "Discarding device blocks" phase). Extrapolating to
> the full-size array, that would be around 7 hours.
>
> The drives alone (full 1.9T size) handle mkfs.ext4 with a full discard
> phase in 1 minute or so. A quick test with raid10 works fine as well
> (albeit a bit slower than I expected - a bit under 6 minutes; full
> fstrim 37 min - but that's perfectly fine for daily/weekly fstrim).
>
> Is raid5's trim implementation expected to be this slow?

I guess this is because raid5 issues trim at a smaller granularity (4kB).
Could you please confirm this theory (by running iostat or blktrace
during mkfs)?

Thanks,
Song
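For reference, one quick check of the theory is to compare the discard
limits the md device advertises against those of the member disks, via
the block queue attributes in sysfs (the md127 and sda names below are
only examples - use whatever /dev/md/test32 and the members resolve to):

# cat /sys/block/md127/queue/discard_granularity
# cat /sys/block/md127/queue/discard_max_bytes
# cat /sys/block/sda/queue/discard_granularity
# cat /sys/block/sda/queue/discard_max_bytes

These values only describe how requests are allowed to be split, not how
raid5 actually issues them downstream, but they give a first hint of
whether the array is advertising a much smaller discard size than the
raw SSDs.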
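A rough sketch of the measurement Song asks for, assuming the array is
still assembled as /dev/md/test32 (device names and the md5trim output
prefix are illustrative; tracing any single member disk is enough to see
the request sizes raid5 sends down):

# iostat -dx 1 sda sdb sdc sdd md127 > iostat.log &
# blktrace -d /dev/sda -a discard -o md5trim &
# mkfs.ext4 /dev/md/test32
# kill %1 %2
# blkparse -i md5trim | less

With a reasonably recent sysstat, iostat's d/s and dareq-sz columns show
the discard rate and average discard request size per device; in the
blkparse output, the discard entries (RWBS field "D") carry "+ sectors"
counts that show what granularity actually reaches the member disk
during the "Discarding device blocks" phase.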