Re: [PATCH 5/5] ceph: fold ceph_update_writeable_page into ceph_write_begin

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2021-06-11 at 16:20 +0100, Matthew Wilcox wrote:
> On Fri, Jun 11, 2021 at 04:11:49PM +0100, David Howells wrote:
> > Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > > On Fri, 2021-06-11 at 10:14 -0400, Andrew W Elble wrote:
> > > > We're seeing file corruption while running 5.10, bisected to 1cc1699070bd:
> > > > 
> > > > > > +static int ceph_write_begin(struct file *file, struct address_space *mapping,
> > > > > > +			    loff_t pos, unsigned len, unsigned flags,
> > > > > > +			    struct page **pagep, void **fsdata)
> > > > 
> > > > <snip>
> > > > 
> > > > > > +		/*
> > > > > > +		 * In some cases we don't need to read at all:
> > > > > > +		 * - full page write
> > > > > > +		 * - write that lies completely beyond EOF
> > > > > > +		 * - write that covers the the page from start to EOF or beyond it
> > > > > > +		 */
> > > > > > +		if ((pos_in_page == 0 && len == PAGE_SIZE) ||
> > > > > > +		    (pos >= i_size_read(inode)) ||
> > > > 
> > > > Shouldn't this be '((pos & PAGE_MASK) >= i_size_read(inode)) ||' ?
> > > > 
> > > > Seems like fs/netfs/read_helper.c currently has the same issue?
> 
> How does (pos & PAGE_MASK) >= i_size_read() make sense?  That could only
> be true if the file is less than a page in size, whereas it should
> always be true if the write starts outside the current i_size.
> 

Yeah, I guess what we really need is to round the i_size up to the start
of the next page and then compare whether pos is beyond that.

> > That's not quite right either.  page may be larger than PAGE_MASK if
> > grab_cache_page_write_begin() returns a THP (if that's possible).
> > 
> > Maybe:
> > 
> > 	(pos & thp_size(page) - 1) >= i_size_read(inode)
> > 
> > Really, we want something like thp_pos().  Maybe Willy has something like that
> > up his sleeve.
> > 
> > In fact, in netfs_write_begin(), index and pos_in_page should be calculated
> > after grab_cache_page_write_begin() has been called, just in case the new page
> > extends before the page containing the requested position.
> 
> Yes.  I do that kind of thing in iomap.  What you're doing there looks
> a bit like offset_in_folio(), but I don't understand the problem with
> just checking pos against i_size directly.
> 

Suppose the i_size is 3 and you do a 1 byte write at offset 5. You're
beyond the EOF, so the condition would return true, but you still need
to read in the start of the page in that case.

I think we probably need a testcase that does this in xfstests:

open file
write 3 bytes at start
close
unmount or drop pagecache in some way
then write 1 byte at offset 5
see whether the resulting contents match the expect ones

>
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
> contains a number of commits that start 'iomap:' which may be of interest.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux