On 19/01/25 21:32, Song Liu wrote:
> On Fri, Jan 25, 2019 at 10:10 AM Michal Soltys <soltys@xxxxxxxx> wrote:
>>
>> I've been testing a simple raid5 consisting of 4 ssds (all handling
>> post-trim reads as 0s safely, as reported by hdparm -I: "Deterministic
>> read ZEROs after TRIM") - with 4.19.13 and 4.20.0 as of now (so decently
>> fresh kernels).
>>
>> The drives themselves are 2T (1.9T after overprovisioning with hpa), so
>> initial tests were done with a 5.7T array, but that was completely
>> unusable in practice the moment discards were attempted.
>>
>> Then I created a tiny array from 32GB-long chunks (pre-discarded manually):
>>
>> # echo Y >/sys/module/raid456/parameters/devices_handle_discard_safely
>> # mdadm -C /dev/md/test32 -e1.1 -l5 -n4 -z32G --assume-clean -N test32 \
>>     /dev/sd{a,b,c,d}1
>>
>> Then I did a simple mkfs.ext4 /dev/md/test32
>>
>> It took over 11 minutes to finish (of which nearly everything was
>> accounted for by the "Discarding device blocks" phase). Extrapolating to
>> the full-size array, that would be around 7 hours.
>>
>> The drives alone (full 1.9T size) handle mkfs.ext4 with the full discard
>> phase in 1 minute or so. A quick test with raid10 works fine as well
>> (albeit a bit slower than I expected - a bit under 6 minutes; a full
>> fstrim took 37min - but that's perfectly fine for daily/weekly fstrim).
>>
>> Is raid5's trim implementation expected to be this slow?
>
> I guess this is because raid5 issues trim at a smaller granularity (4kB).
> Could you please confirm this theory? (by running iostat or blktrace
> during mkfs)
>

Quick iostat -xm 1 shows:

- mkfs.ext4 on raid5 (the small 4x32G one, which took 11min):

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    1.12   11.62    0.00   87.25

Device    r/s    w/s  rMB/s  wMB/s  rrqm/s    wrqm/s  %rrqm   %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz   svctm   %util
..........
sda      0.00   0.00   0.00   0.00    0.00  12604.00   0.00  100.00     0.00     0.00    4.40      0.00      0.00    0.00  100.00
sdc      0.00   0.00   0.00   0.00    0.00  12604.00   0.00  100.00     0.00     0.00    4.80      0.00      0.00    0.00   99.20
sdb      0.00   0.00   0.00   0.00    0.00  12604.00   0.00  100.00     0.00     0.00    4.65      0.00      0.00    0.00   99.20
sdd      0.00   0.00   0.00   0.00    0.00  12595.00   0.00  100.00     0.00     0.00    4.29      0.00      0.00    0.00   99.20

- comparing that to mkfs.ext4 on the raid10 (4x1.9T, ~6min) array:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.13    0.00    0.75   16.92    0.00   82.21

Device    r/s    w/s  rMB/s  wMB/s  rrqm/s    wrqm/s  %rrqm   %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz   svctm   %util
sda      0.00   6.00   0.00   0.02    0.00  11419.00   0.00   99.95     0.00     0.67    4.18      0.00      4.00  158.67   95.20
sdc      0.00   6.00   0.00   0.02    0.00  11421.00   0.00   99.95     0.00     0.67    4.05      0.00      4.00  156.67   94.00
sdb      0.00   6.00   0.00   0.02    0.00  11420.00   0.00   99.95     0.00     0.83    4.04      0.00      4.00  156.00   93.60
sdd      0.00   6.00   0.00   0.02    0.00  11423.00   0.00   99.95     0.00     0.67    3.72      0.00      4.00  150.67   90.40
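
For what it's worth, a blktrace capture during the mkfs should pin down
the per-request discard size directly; a minimal sketch, assuming sda is
one of the member drives:

# capture only discard events on one member and parse them inline
# blktrace -d /dev/sda -a discard -o - | blkparse -i -

The "+ N" length field on the discard events is in 512-byte sectors, so
a steady stream of "+ 8" requests would confirm the 4kB theory. The
discard limits the block devices advertise can also be read from sysfs
(md127 below is just a placeholder for the actual md device node):

# cat /sys/block/sda/queue/discard_granularity
# cat /sys/block/md127/queue/discard_max_bytes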