On Fri, Jan 25, 2019 at 10:10 AM Michal Soltys <soltys@xxxxxxxx> wrote:
>
> I've been testing a simple raid5 consisting of 4 ssds (all handling
> post-trim reads as 0s safely, as reported by hdparm -I: "Deterministic
> read ZEROs after TRIM") - with 4.19.13 and 4.20.0 as of now (so decently
> fresh kernels).
>
> The drives themselves are 2T (1.9T after overprovisioning with hpa), so
> initial tests were done with a 5.7T array, but that was completely
> unusable in practice the moment discards were attempted.
>
> Then I created a tiny array from 32GB partitions (pre-discarded manually):
>
> # echo Y >/sys/module/raid456/parameters/devices_handle_discard_safely
> # mdadm -C /dev/md/test32 -e1.1 -l5 -n4 -z32G --assume-clean -N test32 \
>     /dev/sd{a,b,c,d}1
>
> Then I did a simple mkfs.ext4 /dev/md/test32
>
> It took over 11 minutes to finish (of which nearly everything was
> accounted for by the "Discarding device blocks" phase). Extrapolating to
> the full-size array, that would be around 7 hours.
>
> The drives alone (full 1.9T size) handle mkfs.ext4 with a full discard
> phase in 1 minute or so. A quick test with raid10 works fine as well
> (albeit a bit slower than I expected - a bit under 6 minutes; full
> fstrim 37 min - but that's perfectly fine for daily/weekly fstrim).
>
> Is raid5's trim implementation expected to be this slow?

I guess this is because raid5 issues trim at a smaller granularity (4kB).
Could you please confirm this theory (by running iostat or blktrace
during mkfs)?

Thanks,
Song
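For reference, one quick check of the theory is to compare the discard
limits the md device advertises against those of the member disks, via
the block queue attributes in sysfs (the md127 and sda names below are
only examples - use whatever /dev/md/test32 and the members resolve to):

# cat /sys/block/md127/queue/discard_granularity
# cat /sys/block/md127/queue/discard_max_bytes
# cat /sys/block/sda/queue/discard_granularity
# cat /sys/block/sda/queue/discard_max_bytes

These values only describe how requests are allowed to be split, not how
raid5 actually issues them downstream, but they give a first hint of
whether the array is advertising a much smaller discard size than the
raw SSDs.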
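A rough sketch of the measurement Song asks for, assuming the array is
still assembled as /dev/md/test32 (device names and the md5trim output
prefix are illustrative; tracing any single member disk is enough to see
the request sizes raid5 sends down):

# iostat -dx 1 sda sdb sdc sdd md127 > iostat.log &
# blktrace -d /dev/sda -a discard -o md5trim &
# mkfs.ext4 /dev/md/test32
# kill %1 %2
# blkparse -i md5trim | less

With a reasonably recent sysstat, iostat's d/s and dareq-sz columns show
the discard rate and average discard request size per device; in the
blkparse output, the discard entries (RWBS field "D") carry "+ sectors"
counts that show what granularity actually reaches the member disk
during the "Discarding device blocks" phase.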