On Wed, 2021-11-17 at 21:40 +0800, Xiubo Li wrote: > On 11/17/21 9:28 PM, Jeff Layton wrote: > > On Wed, 2021-11-17 at 09:21 +0800, Xiubo Li wrote: > > > On 11/17/21 4:06 AM, Jeff Layton wrote: > > > > On Tue, 2021-11-16 at 17:20 +0800, xiubli@xxxxxxxxxx wrote: > [...] > > > > > So when filling the inode it will truncate the pagecache by using > > > > > truncate_sizeA again, which makes no sense and will trim the inocent > > > > > pages. > > > > > > > > > Is there a reproducer for this? It would be nice to put something in > > > > xfstests for it if so. > > > In xfstests' generic/075 has already testing this, but i didn't see any > > > issue it reproduce. > > > > > > I just found this strange logs when it's doing > > > something like: > > > > > > truncateA 0x10000 --> 0x2000 > > > > > > truncateB 0x2000 --> 0x8000 > > > > > > truncateC 0x8000 --> 0x6000 > > > > > > For the truncateC, the log says: > > > > > > ceph: truncate_size 0x2000 -> 0x6000 > > > > > > > > > The problem is that the truncateB will also do the vmtruncate by using > > > the 0x2000 instead, the vmtruncate will not flush the dirty pages to the > > > OSD and will just discard them from the pagecaches. Then we may lost > > > some new updated data in case there has any write before the truncateB > > > in range [0x2000, 0x8000). > > > > > > > > Is that reproducible without the fscrypt size handling changes? I > > haven't seen generic/075 fail on stock kernels. > > Yeah, there is no error about this, since there has no any write between > truncateA and truncateB, if there has I am afraid the test will fail in > theory. > > Interesting. It might be worth trying to craft a test that does that and see if you can get it to fail on stock kernels. I'd feel better about this if we had a reliable test for it. > > If this is a generic bug, then we should go ahead and fix it in > > mainline. If it's a problem only with fscrypt enabled, then let's plan > > to roll this patch into those changes. > > > I think it makes sense always to check the truncate_seq and > truncate_size here in kclient, or at least should we warn it in case the > MDS will do this again in future ? > > The truncate_seq and truncate_size should always be changed at the same > time. > > Make sense ? > > I think so. I'll plan to do some testing with this today and will plan to pull it into the testing branch. It would be nice to have a reliable test for this issue though, so we can ensure it doesn't regress later. > > > > > > > > > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx> > > > > > --- > > > > > fs/ceph/inode.c | 5 +++-- > > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c > > > > > index 1b4ce453d397..b4f784684e64 100644 > > > > > --- a/fs/ceph/inode.c > > > > > +++ b/fs/ceph/inode.c > > > > > @@ -738,10 +738,11 @@ int ceph_fill_file_size(struct inode *inode, int issued, > > > > > * don't hold those caps, then we need to check whether > > > > > * the file is either opened or mmaped > > > > > */ > > > > > - if ((issued & (CEPH_CAP_FILE_CACHE| > > > > > + if (ci->i_truncate_size != truncate_size && > > > > > + ((issued & (CEPH_CAP_FILE_CACHE| > > > > > CEPH_CAP_FILE_BUFFER)) || > > > > > mapping_mapped(inode->i_mapping) || > > > > > - __ceph_is_file_opened(ci)) { > > > > > + __ceph_is_file_opened(ci))) { > > > > > ci->i_truncate_pending++; > > > > > queue_trunc = 1; > > > > > } > -- Jeff Layton <jlayton@xxxxxxxxxx>