Re: [PATCH] md/raid5: Improve performance for sequential IO

Logan Gunthorpe <logang@xxxxxxxxxxxx> · Wed, 19 Apr 2023 13:04:21 -0600

On 2023-04-17 11:15, Jan Kara wrote:
> Commit 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") changed the
> order in which requests for underlying disks are created. Since for
> large sequential IO adding of requests frequently races with md_raid5
> thread submitting bios to underlying disks, this results in a change in
> IO pattern because intermediate states of new order of request creation
> result in more smaller discontiguous requests. For RAID5 on top of three
> rotational disks our performance testing revealed this results in
> regression in write throughput:
> 
> iozone -a -s 131072000 -y 4 -q 8 -i 0 -i 1 -R
> 
> before 7e55c60acfbb:
>               KB  reclen   write rewrite    read    reread
>        131072000       4  493670  525964   524575   513384
>        131072000       8  540467  532880   512028   513703
> 
> after 7e55c60acfbb:
>               KB  reclen   write rewrite    read    reread
>        131072000       4  421785  456184   531278   509248
>        131072000       8  459283  456354   528449   543834
> 
> To reduce the amount of discontiguous requests we can start generating
> requests with the stripe with the lowest chunk offset as that has the
> best chance of being adjacent to IO queued previously. This improves the
> performance to:
>               KB  reclen   write rewrite    read    reread
>        131072000       4  497682  506317   518043   514559
>        131072000       8  514048  501886   506453   504319
> 
> restoring big part of the regression.
> 
> Fixes: 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()")
> Signed-off-by: Jan Kara <jack@xxxxxxx>

Looks good to me. I ran it through some of the functional tests I used
to develop the patch in question and found no issues.

Reviewed-by: Logan Gunthorpe <logang@xxxxxxxxxxxx>

> ---
>  drivers/md/raid5.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 44 insertions(+), 1 deletion(-)
> 
> I'm by no means raid5 expert but this is what I was able to come up with. Any
> opinion or ideas how to fix the problem in a better way?

The other option would be to revert to the old method for spinning disks
and use the pivot option only on SSDs. The pivot optimization really
only applies at IO speeds that can be achieved by fast solid state disks
anyway, and there isn't likely enough IOPS possible on spinning disks to
get enough lock contention that the optimization would provide any benefit.

So it could make sense to just fall back to the old method of preparing
the stripes in logical block order if we are running on spinning disks.
Though, that might be a bit more involved than what this patch does. So
I think this is probably a good approach, unless we want to recover more
of the lost throughput.

Logan