Re: performance regression between 6.1.x and 5.15.x

Hi,


> On Wed, May 10, 2023 at 01:46:49PM +0800, Wang Yugui wrote:
> > > Ok, that is further back in time than I expected. In terms of XFS,
> > > there are only two commits between 5.16..5.17 that might impact
> > > performance:
> > > 
> > > ebb7fb1557b1 ("xfs, iomap: limit individual ioend chain lengths in writeback")
> > > 
> > > and
> > > 
> > > 6795801366da ("xfs: Support large folios")
> > > 
> > > To test whether ebb7fb1557b1 is the cause, go to
> > > fs/iomap/buffered-io.c and change:
> > > 
> > > -#define IOEND_BATCH_SIZE        4096
> > > +#define IOEND_BATCH_SIZE        1048576
> > > 
> > > This will increase the IO submission chain lengths to at least 4GB
> > > from the 16MB bound that was placed on 5.17 and newer kernels.
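> > > (IOEND_BATCH_SIZE is counted in pages, so with 4kB pages the old
> > > value works out to 4096 x 4kB = 16MB and the new one to
> > > 1048576 x 4kB = 4GB per ioend chain.)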
> > > 
> > > To test whether 6795801366da is the cause, go to fs/xfs/xfs_icache.c
> > > and comment out both calls to mapping_set_large_folios(). This will
> > > ensure the page cache only instantiates single page folios the same
> > > as 5.16 would have.
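> > > 
> > > As an untested sketch (the exact call sites can differ by kernel
> > > version; in 6.1 they should be in xfs_inode_alloc() and
> > > xfs_reinit_inode(), but check your tree), each change would look
> > > something like:
> > > 
> > > -	mapping_set_large_folios(VFS_I(ip)->i_mapping);
> > > +	/* mapping_set_large_folios(VFS_I(ip)->i_mapping); */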
> > 
> > 6.1.x with 'mapping_set_large_folios removed' and 'IOEND_BATCH_SIZE=1048576':
> > 	fio WRITE: bw=6451MiB/s (6764MB/s)
> > 
> > There is still a performance regression compared to linux 5.16.20:
> > 	fio WRITE: bw=7666MiB/s (8039MB/s)
> > 
> > but that regression is not very big, which makes it difficult to bisect.
> > We noticed the same level of performance regression on btrfs too,
> > so maybe the problem is in code that is shared by both btrfs and xfs,
> > such as iomap or mm/folio.
> 
> Yup, that's quite possibly something like the multi-gen LRU changes,
> but that's not the regression we need to find. :/
> 
> > 6.1.x with 'mapping_set_large_folios removed' only:
> > 	fio   WRITE: bw=2676MiB/s (2806MB/s)
> > 
> > 6.1.x with 'IOEND_BATCH_SIZE=1048576' only:
> > 	fio WRITE: bw=5092MiB/s (5339MB/s),
> > 	fio  WRITE: bw=6076MiB/s (6371MB/s)
> > 
> > Maybe we need more fixes for 'ebb7fb1557b1 ("xfs, iomap: limit
> > individual ioend chain lengths in writeback")'.
> 
> OK, can you re-run the two 6.1.x kernels above (the slow and the
> fast) and record the output of `iostat -dxm 1` whilst the
> fio test is running? I want to see what the overall differences in
> the IO load on the devices are between the two runs. This will tell
> us how the IO sizes and queue depths change between the two kernels,
> etc.
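> 
> For example, something like this (the fio job file name is just a
> placeholder for whatever you are already running):
> 
> # capture per-device IO stats once a second while fio runs
> iostat -dxm 1 > iostat.txt &
> IOSTAT_PID=$!
> fio write-test.fio	# placeholder for your existing fio job
> kill $IOSTAT_PID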

The `iostat -dxm 1` results are saved in the attached files:
good.txt	good performance
bad.txt		bad performance

Best Regards
Wang Yugui (wangyugui@xxxxxxxxxxxx)
2023/05/10

> 
> Right now I'm suspecting a contention interaction between write(),
> do_writepages() and folio_end_writeback()...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx

Attachment: good.txt
Description: Binary data

Attachment: bad.txt
Description: Binary data

