Re: [RFC 2/2] iomap: Support subpage size dirty tracking to improve write performance

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Fri, 28 Oct 2022 19:15:41 +0100

On Fri, Oct 28, 2022 at 10:01:19AM -0700, Darrick J. Wong wrote:
> On Fri, Oct 28, 2022 at 10:00:33AM +0530, Ritesh Harjani (IBM) wrote:
> > Performance testing of below fio workload reveals ~16x performance
> > improvement on nvme with XFS (4k blocksize) on Power (64K pagesize)
> > FIO reported write bw scores improved from around ~28 MBps to ~452 MBps.
> > 
> > <test_randwrite.fio>
> > [global]
> > 	ioengine=psync
> > 	rw=randwrite
> > 	overwrite=1
> > 	pre_read=1
> > 	direct=0
> > 	bs=4k
> > 	size=1G
> > 	dir=./
> > 	numjobs=8
> > 	fdatasync=1
> > 	runtime=60
> > 	iodepth=64
> > 	group_reporting=1
>
> Admittedly I'm not thrilled at the reintroduction of page and iop dirty
> state that are updated in separate places, but OTOH the write
> amplification here is demonstrably horrifying as you point out so it's
> clearly necessary.

Well, *something* is necessary.  I worked on a different approach that
would have similar effects for this exact workload, which was to submit
the I/O for O_SYNC while we still know which part of the page we
dirtied.

Previous discussion:
https://lore.kernel.org/all/YQlgjh2R8OzJkFoB@xxxxxxxxxxxxxxxxxxxx/

Actual patches:
https://lore.kernel.org/all/20220503064008.3682332-1-willy@xxxxxxxxxxxxx/