Re: [PATCH] md/raid5: Improve performance for sequential IO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed 19-04-23 22:26:07, Song Liu wrote:
> On Wed, Apr 19, 2023 at 12:04 PM Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
> > On 2023-04-17 11:15, Jan Kara wrote:
> > > Commit 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") changed the
> > > order in which requests for underlying disks are created. Since for
> > > large sequential IO adding of requests frequently races with md_raid5
> > > thread submitting bios to underlying disks, this results in a change in
> > > IO pattern because intermediate states of new order of request creation
> > > result in more smaller discontiguous requests. For RAID5 on top of three
> > > rotational disks our performance testing revealed this results in
> > > regression in write throughput:
> > >
> > > iozone -a -s 131072000 -y 4 -q 8 -i 0 -i 1 -R
> > >
> > > before 7e55c60acfbb:
> > >               KB  reclen   write rewrite    read    reread
> > >        131072000       4  493670  525964   524575   513384
> > >        131072000       8  540467  532880   512028   513703
> > >
> > > after 7e55c60acfbb:
> > >               KB  reclen   write rewrite    read    reread
> > >        131072000       4  421785  456184   531278   509248
> > >        131072000       8  459283  456354   528449   543834
> > >
> > > To reduce the amount of discontiguous requests we can start generating
> > > requests with the stripe with the lowest chunk offset as that has the
> > > best chance of being adjacent to IO queued previously. This improves the
> > > performance to:
> > >               KB  reclen   write rewrite    read    reread
> > >        131072000       4  497682  506317   518043   514559
> > >        131072000       8  514048  501886   506453   504319
> > >
> > > restoring big part of the regression.
> > >
> > > Fixes: 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()")
> > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> >
> > Looks good to me. I ran it through some of the functional tests I used
> > to develop the patch in question and found no issues.
> >
> > Reviewed-by: Logan Gunthorpe <logang@xxxxxxxxxxxx>
> 
> Thanks Jan and Logan! I will apply this to md-next. But it may not make
> 6.4 release, as we are already at rc7.

OK, sure, this is not a critical issue.

> > > ---
> > >  drivers/md/raid5.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 44 insertions(+), 1 deletion(-)
> > >
> > > I'm by no means raid5 expert but this is what I was able to come up with. Any
> > > opinion or ideas how to fix the problem in a better way?
> >
> > The other option would be to revert to the old method for spinning disks
> > and use the pivot option only on SSDs. The pivot optimization really
> > only applies at IO speeds that can be achieved by fast solid state disks
> > anyway, and there isn't likely enough IOPS possible on spinning disks to
> > get enough lock contention that the optimization would provide any benefit.
> >
> > So it could make sense to just fall back to the old method of preparing
> > the stripes in logical block order if we are running on spinning disks.
> > Though, that might be a bit more involved than what this patch does. So
> > I think this is probably a good approach, unless we want to recover more
> > of the lost throughput.
> 
> How about we only do the optimization in this patch for spinning disks?
> Something like:
> 
>         if (likely(conf->reshape_progress == MaxSector) &&
>             !blk_queue_nonrot(mddev->queue))
>                 logical_sector = raid5_bio_lowest_chunk_sector(conf, bi);

Yeah, makes sense. On SSD disks I was not able to observe any adverse
effects of the different stripe order. Will you update the patch or should
I respin it?

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux