On Fri, Jun 11, 2021 at 04:11:49PM +0100, David Howells wrote: > Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > On Fri, 2021-06-11 at 10:14 -0400, Andrew W Elble wrote: > > > We're seeing file corruption while running 5.10, bisected to 1cc1699070bd: > > > > > > > > +static int ceph_write_begin(struct file *file, struct address_space *mapping, > > > > > + loff_t pos, unsigned len, unsigned flags, > > > > > + struct page **pagep, void **fsdata) > > > > > > <snip> > > > > > > > > + /* > > > > > + * In some cases we don't need to read at all: > > > > > + * - full page write > > > > > + * - write that lies completely beyond EOF > > > > > + * - write that covers the the page from start to EOF or beyond it > > > > > + */ > > > > > + if ((pos_in_page == 0 && len == PAGE_SIZE) || > > > > > + (pos >= i_size_read(inode)) || > > > > > > Shouldn't this be '((pos & PAGE_MASK) >= i_size_read(inode)) ||' ? > > > > > > Seems like fs/netfs/read_helper.c currently has the same issue? How does (pos & PAGE_MASK) >= i_size_read() make sense? That could only be true if the file is less than a page in size, whereas it should always be true if the write starts outside the current i_size. > That's not quite right either. page may be larger than PAGE_MASK if > grab_cache_page_write_begin() returns a THP (if that's possible). > > Maybe: > > (pos & thp_size(page) - 1) >= i_size_read(inode) > > Really, we want something like thp_pos(). Maybe Willy has something like that > up his sleeve. > > In fact, in netfs_write_begin(), index and pos_in_page should be calculated > after grab_cache_page_write_begin() has been called, just in case the new page > extends before the page containing the requested position. Yes. I do that kind of thing in iomap. What you're doing there looks a bit like offset_in_folio(), but I don't understand the problem with just checking pos against i_size directly. https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio contains a number of commits that start 'iomap:' which may be of interest.