On Fri, 2021-06-11 at 16:20 +0100, Matthew Wilcox wrote:
> On Fri, Jun 11, 2021 at 04:11:49PM +0100, David Howells wrote:
> > Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > > On Fri, 2021-06-11 at 10:14 -0400, Andrew W Elble wrote:
> > > > We're seeing file corruption while running 5.10, bisected to 1cc1699070bd:
> > > > 
> > > > > > +static int ceph_write_begin(struct file *file, struct address_space *mapping,
> > > > > > +                            loff_t pos, unsigned len, unsigned flags,
> > > > > > +                            struct page **pagep, void **fsdata)
> > > > 
> > > > <snip>
> > > > 
> > > > > > +        /*
> > > > > > +         * In some cases we don't need to read at all:
> > > > > > +         * - full page write
> > > > > > +         * - write that lies completely beyond EOF
> > > > > > +         * - write that covers the the page from start to EOF or beyond it
> > > > > > +         */
> > > > > > +        if ((pos_in_page == 0 && len == PAGE_SIZE) ||
> > > > > > +            (pos >= i_size_read(inode)) ||
> > > > 
> > > > Shouldn't this be '((pos & PAGE_MASK) >= i_size_read(inode)) ||' ?
> > > > 
> > > > Seems like fs/netfs/read_helper.c currently has the same issue?
> 
> How does (pos & PAGE_MASK) >= i_size_read() make sense?  That could only
> be true if the file is less than a page in size, whereas it should
> always be true if the write starts outside the current i_size.
> 

Yeah, I guess what we really need is to round the i_size up to the start
of the next page and then compare whether pos is beyond that.

> > That's not quite right either.  page may be larger than PAGE_MASK if
> > grab_cache_page_write_begin() returns a THP (if that's possible).
> > 
> > Maybe:
> > 
> >         (pos & thp_size(page) - 1) >= i_size_read(inode)
> > 
> > Really, we want something like thp_pos().  Maybe Willy has something like that
> > up his sleeve.
> > 
> > In fact, in netfs_write_begin(), index and pos_in_page should be calculated
> > after grab_cache_page_write_begin() has been called, just in case the new page
> > extends before the page containing the requested position.
> 
> Yes.  I do that kind of thing in iomap.  What you're doing there looks
> a bit like offset_in_folio(), but I don't understand the problem with
> just checking pos against i_size directly.
> 

Suppose the i_size is 3 and you do a 1 byte write at offset 5. You're
beyond the EOF, so the condition would return true, but you still need
to read in the start of the page in that case.

I think we probably need a testcase that does this in xfstests:

    open file
    write 3 bytes at start
    close
    unmount or drop pagecache in some way
    then write 1 byte at offset 5
    see whether the resulting contents match the expected ones

> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
> contains a number of commits that start 'iomap:' which may be of interest.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
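
To make the i_size == 3, offset 5 case concrete, here is a rough userspace sketch of the two checks. It is not the kernel code: the function names and SKETCH_PAGE_SIZE are made up for illustration, it assumes fixed 4 KiB pages (so it ignores the THP point above), and the third condition is my reading of the patch comment, since that line was snipped from the quote.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: fixed 4 KiB pages, no THPs or large folios. */
#define SKETCH_PAGE_SIZE 4096LL

/*
 * Hypothetical model of the check as quoted in the patch.
 * Returns true when the read of the page would be skipped.
 */
static bool skip_read_as_posted(int64_t pos, int64_t len, int64_t i_size)
{
	int64_t pos_in_page = pos & (SKETCH_PAGE_SIZE - 1);

	return (pos_in_page == 0 && len == SKETCH_PAGE_SIZE) || /* full page write */
	       pos >= i_size ||                                 /* write starts at or past EOF */
	       (pos_in_page == 0 && pos + len >= i_size);       /* page start to EOF or beyond */
}

/*
 * Hypothetical variant: round i_size up to the next page boundary
 * before comparing, so a write into the page that holds EOF still
 * triggers a read of the existing data in that page.
 */
static bool skip_read_rounded(int64_t pos, int64_t len, int64_t i_size)
{
	int64_t pos_in_page = pos & (SKETCH_PAGE_SIZE - 1);
	int64_t i_size_rounded = (i_size + SKETCH_PAGE_SIZE - 1) &
				 ~(SKETCH_PAGE_SIZE - 1);

	return (pos_in_page == 0 && len == SKETCH_PAGE_SIZE) ||
	       pos >= i_size_rounded ||
	       (pos_in_page == 0 && pos + len >= i_size);
}
```

With i_size = 3, pos = 5 and len = 1, skip_read_as_posted() returns true (the read is skipped and bytes 0-2 are never brought in), while skip_read_rounded() returns false. A write at offset 4096 + 5 into the same file lands in a page with no existing data, and the rounded variant still skips the read there.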