Re: [PATCH 5/5] iomap: support RWF_UNCACHED for buffered writes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 12, 2019 at 05:57:57PM -0700, Jens Axboe wrote:
> On 12/12/19 5:54 PM, Jens Axboe wrote:
> > On 12/12/19 3:34 PM, Dave Chinner wrote:
> >> On Wed, Dec 11, 2019 at 08:29:43AM -0700, Jens Axboe wrote:
> >>> This adds support for RWF_UNCACHED for file systems using iomap to
> >>> perform buffered writes. We use the generic infrastructure for this,
> >>> by tracking pages we created and calling write_drop_cached_pages()
> >>> to issue writeback and prune those pages.
> >>>
> >>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> >>> ---
> >>>  fs/iomap/apply.c       | 24 ++++++++++++++++++++++++
> >>>  fs/iomap/buffered-io.c | 37 +++++++++++++++++++++++++++++--------
> >>>  include/linux/iomap.h  |  5 +++++
> >>>  3 files changed, 58 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
> >>> index 562536da8a13..966826ad4bb9 100644
> >>> --- a/fs/iomap/apply.c
> >>> +++ b/fs/iomap/apply.c
> >>> @@ -90,5 +90,29 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
> >>>  				     flags, &iomap);
> >>>  	}
> >>>  
> >>> +	if (written && (flags & IOMAP_UNCACHED)) {
> >>> +		struct address_space *mapping = inode->i_mapping;
> >>> +
> >>> +		end = pos + written;
> >>> +		ret = filemap_write_and_wait_range(mapping, pos, end);
> >>> +		if (ret)
> >>> +			goto out;
> >>> +
> >>> +		/*
> >>> +		 * No pages were created for this range, we're done
> >>> +		 */
> >>> +		if (!(iomap.flags & IOMAP_F_PAGE_CREATE))
> >>> +			goto out;
> >>> +
> >>> +		/*
> >>> +		 * Try to invalidate cache pages for the range we just wrote.
> >>> +		 * We don't care if invalidation fails as the write has still
> >>> +		 * worked and leaving clean uptodate pages in the page cache
> >>> +		 * isn't a corruption vector for uncached IO.
> >>> +		 */
> >>> +		invalidate_inode_pages2_range(mapping,
> >>> +				pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
> >>> +	}
> >>> +out:
> >>>  	return written ? written : ret;
> >>>  }
> >>
> >> Just a thought on further optimisation for this for XFS.
> >> IOMAP_UNCACHED is being passed into the filesystem ->iomap_begin
> >> methods by iomap_apply().  Hence the filesystems know that it is
> >> an uncached IO that is being done, and we can tailor allocation
> >> strategies to suit the fact that the data is going to be written
> >> immediately.
> >>
> >> In this case, XFS needs to treat it the same way it treats direct
> >> IO. That is, we do immediate unwritten extent allocation rather than
> >> delayed allocation. This will reduce the allocation overhead and
> >> will optimise for immediate IO locality rather than optimise for
> >> delayed allocation.
> >>
> >> This should just be a relatively simple change to
> >> xfs_file_iomap_begin() along the lines of:
> >>
> >> -	if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && !(flags & IOMAP_DIRECT) &&
> >> -			!IS_DAX(inode) && !xfs_get_extsz_hint(ip)) {
> >> +	if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) &&
> >> +	    !(flags & (IOMAP_DIRECT | IOMAP_UNCACHED)) &&
> >> +	    !IS_DAX(inode) && !xfs_get_extsz_hint(ip)) {
> >> 		/* Reserve delalloc blocks for regular writeback. */
> >> 		return xfs_file_iomap_begin_delay(inode, offset, length, flags,
> >> 				iomap);
> >> 	}
> >>
> >> so that it avoids delayed allocation for uncached IO...
> > 
> > That's very handy! Thanks, I'll add that to the next version. Just out
> > of curiosity, would you prefer this as a separate patch, or just bundle
> > it with the iomap buffered RWF_UNCACHED patch? I'm assuming the latter,
> > and I'll just mention it in the changelog.
> 
> OK, since it's in XFS, it'd be a separate patch.

*nod*

> The code you quote seems
> to be something out-of-tree?

Ah, I quoted the code in the 5.4 release branch, not the 5.5-rc1
tree. I'd forgotten that the xfs_file_iomap_begin() got massively
refactored in the 5.5 merge and I hadn't updated my cscope trees. SO
I'm guessing you want to go looking for the
xfs_buffered_write_iomap_begin() and add another case to this
initial branch:

        /* we can't use delayed allocations when using extent size hints */
        if (xfs_get_extsz_hint(ip))
                return xfs_direct_write_iomap_begin(inode, offset, count,
                                flags, iomap, srcmap);

To make the buffered write IO go down the direct IO allocation path...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux