Re: [PATCH] ext4: allow concurrent unaligned dio overwrites

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Brian Foster <bfoster@xxxxxxxxxx> writes:

> We've had reports of significant performance regression of sub-block
> (unaligned) direct writes due to the added exclusivity restrictions
> in ext4. The purpose of the exclusivity requirement for unaligned
> direct writes is to avoid data corruption caused by unserialized
> partial block zeroing in the iomap dio layer across overlapping
> writes.
>
> XFS has similar requirements for the same underlying reasons, yet
> doesn't suffer the extreme performance regression that ext4 does.
> The reason for this is that XFS utilizes IOMAP_DIO_OVERWRITE_ONLY
> mode, which allows for optimistic submission of concurrent unaligned
> I/O and kicks back writes that require partial block zeroing such
> that they can be submitted in a safe, exclusive context. Since ext4
> already performs most of these checks pre-submission, it can support
> something similar without necessarily relying on the iomap flag and
> associated retry mechanism.
>
> Update the dio write submission path to allow concurrent submission
> of unaligned direct writes that are purely overwrite and so will not
> require block zeroing. To improve readability of the various related
> checks, move the unaligned I/O handling down into
> ext4_dio_write_checks(), where the dio draining and force wait logic
> can immediately follow the locking requirement checks. Finally, the
> IOMAP_DIO_OVERWRITE_ONLY flag is set to enable a warning check as a
> precaution should the ext4 overwrite logic ever become inconsistent
> with the zeroing expectations of iomap dio.
>
> The performance improvement of sub-block direct write I/O is shown
> in the following fio test on a 64xcpu guest vm:
>
> Test: fio --name=test --ioengine=libaio --direct=1 --group_reporting
> --overwrite=1 --thread --size=10G --filename=/mnt/fio
> --readwrite=write --ramp_time=10s --runtime=60s --numjobs=8
> --blocksize=2k --iodepth=256 --allow_file_create=0
>
> v6.2:		write: IOPS=4328, BW=8724KiB/s
> v6.2 (patched):	write: IOPS=801k, BW=1565MiB/s
>
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> ---
>
> Hi all,
>
> This survives a couple fstests regression runs (with 4k and 2k block
> sizes) and cleans up the code a bit from the RFC, taking a suggestion
> from Ritesh to move some of the checks into ext4_dio_write_checks().

Thanks for working on the suggestion and rebasing on top.
I liked the way this patch has turned out to be. It's very clear now.

Thanks again for the optimization!!
Looks good to me. Please feel free to add -

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>



[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux