On Wed, May 10, 2023 at 01:46:49PM +0800, Wang Yugui wrote:
> > Ok, that is further back in time than I expected. In terms of XFS,
> > there are only two commits between 5.16..5.17 that might impact
> > performance:
> >
> > ebb7fb1557b1 ("xfs, iomap: limit individual ioend chain lengths in writeback")
> >
> > and
> >
> > 6795801366da ("xfs: Support large folios")
> >
> > To test whether ebb7fb1557b1 is the cause, go to
> > fs/iomap/buffered-io.c and change:
> >
> > -#define IOEND_BATCH_SIZE 4096
> > +#define IOEND_BATCH_SIZE 1048576
> > This will increase the IO submission chain lengths to at least 4GB
> > from the 16MB bound that was placed on 5.17 and newer kernels.
> >
> > To test whether 6795801366da is the cause, go to fs/xfs/xfs_icache.c
> > and comment out both calls to mapping_set_large_folios(). This will
> > ensure the page cache only instantiates single page folios the same
> > as 5.16 would have.
>
> 6.1.x with 'mapping_set_large_folios remove' and 'IOEND_BATCH_SIZE=1048576'
> fio WRITE: bw=6451MiB/s (6764MB/s)
>
> still performance regression when compare to linux 5.16.20
> fio WRITE: bw=7666MiB/s (8039MB/s),
>
> but the performance regression is not too big, then difficult to bisect.
> We noticed samle level performance regression on btrfs too.
> so maby some problem of some code that is used by both btrfs and xfs
> such as iomap and mm/folio.

Yup, that's quite possibly something like the multi-gen LRU changes,
but that's not the regression we need to find. :/

> 6.1.x with 'mapping_set_large_folios remove' only'
> fio WRITE: bw=2676MiB/s (2806MB/s)
>
> 6.1.x with 'IOEND_BATCH_SIZE=1048576' only'
> fio WRITE: bw=5092MiB/s (5339MB/s),
> fio WRITE: bw=6076MiB/s (6371MB/s)
>
> maybe we need more fix or ' ebb7fb1557b1 ("xfs, iomap: limit
> individual ioend chain lengths in writeback")'.

OK, can you re-run the two 6.1.x kernels above (the slow and the fast)
and record the output of `iostat -dxm 1` whilst the fio test is running?
I want to see what the overall differences in the IO load on the
devices are between the two runs. This will tell us how the IO sizes
and queue depths change between the two kernels, etc. Right now I'm
suspecting a contention interaction between write(), do_writepages()
and folio_end_writeback()...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
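
For reference, the 16MB and 4GB figures in the quoted text follow from
multiplying IOEND_BATCH_SIZE by the size of each folio in the chain. A
minimal sketch of that arithmetic, assuming 4KiB pages and single-page
folios; the helper name chain_bound_bytes() is hypothetical and is not
part of the kernel:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: bytes covered by an ioend chain
 * holding batch_size folios of folio_bytes each.
 * Assumption: every folio is a single page of folio_bytes bytes. With
 * large folios the same folio count would cover more data.
 */
static inline uint64_t chain_bound_bytes(uint64_t batch_size,
					 uint64_t folio_bytes)
{
	/* 64-bit math so the 1048576 * 4096 case does not overflow. */
	return batch_size * folio_bytes;
}
```

With 4096-byte pages, chain_bound_bytes(4096, 4096) is 16777216 bytes
(16MiB) and chain_bound_bytes(1048576, 4096) is 4294967296 bytes (4GiB).
Larger folios would push the second figure higher, which is presumably
why the quoted text says "at least 4GB".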