On 1/25/19 9:32 PM, Song Liu wrote:
> On Fri, Jan 25, 2019 at 10:10 AM Michal Soltys <soltys@xxxxxxxx> wrote:
> I guess this is because raid5 issues trim at a smaller granularity (4kB).
> Could you please confirm this theory? (by running iostat, or blktrace
> during mkfs)
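A quick way to watch it live is extended iostat on the members while the
mkfs (or blkdiscard) runs - something along these lines (just a sketch;
the member names are examples and the exact -x columns differ between
sysstat versions):

  # 1-second extended device stats for the array and its members
  iostat -dx md10 sda sdb sdc sdd 1

Newer kernels/sysstat report discards as separate columns; on older ones
blktrace gives the more reliable view, which is what the rest of this
mail is based on.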
I also did a blktrace of just a simple blkdiscard on the small raid
(4x32 GB raid5, which also took ~11 minutes); the binary data is around
7 GB and the parsed output around 10 GB.
I attached a fragment of the output + summary at:
https://drive.google.com/open?id=12uku_bTrw2VLOVXgSDN4vGJYS2iNzO5y
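The excerpt below is from one of the member disks (8,0 - the A lines show
the remap from md10 (9,10) through the partition (8,1) onto the disk); a
capture along these lines should reproduce it (just a sketch - device
names and the output prefix are examples, not necessarily the exact
invocation):

  # record everything hitting one raid member while the array is discarded
  blktrace -d /dev/sda -o sda-discard &
  blkdiscard /dev/md10
  kill %1                                     # stop the trace once blkdiscard returns
  blkparse -i sda-discard > sda-discard.txt   # text form as quoted below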
But to my untrained eye it looks roughly as you say - lots of small 4 kB
discard requests that get merged into larger 512 kB chunks before being
submitted to the member devices, with each merged chunk taking up to
10 ms to complete on the same device (a quick way to pull those
completion gaps out of the trace is sketched below the excerpt). E.g.:
8,0 1 269796 53.012284482 7185 A D 5021488 + 8 <- (9,10) 0
8,0 1 269797 53.012284580 7185 A D 5023536 + 8 <- (8,1) 5021488
8,0 1 269798 53.012284685 7185 Q D 5023536 + 8 [md10_raid5]
8,0 1 269799 53.012284833 7185 M D 5023536 + 8 [md10_raid5]
8,0 1 269800 53.012286965 7185 UT N [md10_raid5] 1
8,0 1 269801 53.012287212 7185 I D 5022520 + 1024 [md10_raid5]
8,0 1 269802 53.012295378 405 D D 5022520 + 1024 [kworker/1:1H]
8,0 1 269803 53.021024791 0 C D 5019448 + 1024 [0]
8,0 1 269804 53.031027682 0 C D 5020472 + 1024 [0]
8,0 1 269805 53.040992885 0 C D 5021496 + 1024 [0]
8,0 1 269806 53.041155059 7185 A D 5021496 + 8 <- (9,10) 0
8,0 1 269807 53.041155208 7185 A D 5023544 + 8 <- (8,1) 5021496
8,0 1 269808 53.041155379 7185 Q D 5023544 + 8 [md10_raid5]
8,0 1 269809 53.041155614 7185 G D 5023544 + 8 [md10_raid5]
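The completion gaps are easy to pull out of the parsed trace; with the
default blkparse output format (as above), something like this prints the
time between successive discard completions (sketch - sda-discard.txt is
just the example file name from earlier):

  # delta in seconds between consecutive C (complete) events for discards (rwbs D)
  awk '$6 == "C" && $7 == "D" { if (n++) printf "%.6f\n", $4 - prev; prev = $4 }' sda-discard.txt

And the rough numbers line up: if each merged 512 kB (1024-sector) discard
really takes ~10 ms to complete on a member, then one 32 GB member needs
about 32 GiB / 512 KiB = 65536 of them, i.e. 65536 x 10 ms ~ 655 s - which
is pretty much the ~11 minutes the whole blkdiscard took, given the
members are discarded in parallel.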