Re: [PATCH] ceph: do not truncate pagecache if truncate size doesn't change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2021-11-17 at 21:40 +0800, Xiubo Li wrote:
> On 11/17/21 9:28 PM, Jeff Layton wrote:
> > On Wed, 2021-11-17 at 09:21 +0800, Xiubo Li wrote:
> > > On 11/17/21 4:06 AM, Jeff Layton wrote:
> > > > On Tue, 2021-11-16 at 17:20 +0800, xiubli@xxxxxxxxxx wrote:
> [...]
> > > > > So when filling the inode it will truncate the pagecache by using
> > > > > truncate_sizeA again, which makes no sense and will trim the inocent
> > > > > pages.
> > > > > 
> > > > Is there a reproducer for this? It would be nice to put something in
> > > > xfstests for it if so.
> > > In xfstests' generic/075 has already testing this, but i didn't see any
> > > issue it reproduce.
> > > 
> > > I just found this strange logs when it's doing
> > > something like:
> > > 
> > > truncateA 0x10000 --> 0x2000
> > > 
> > > truncateB 0x2000   --> 0x8000
> > > 
> > > truncateC 0x8000   --> 0x6000
> > > 
> > > For the truncateC, the log says:
> > > 
> > > ceph:  truncate_size 0x2000 -> 0x6000
> > > 
> > > 
> > > The problem is that the truncateB will also do the vmtruncate by using
> > > the 0x2000 instead, the vmtruncate will not flush the dirty pages to the
> > > OSD and will just discard them from the pagecaches. Then we may lost
> > > some new updated data in case there has any write before the truncateB
> > > in range [0x2000, 0x8000).
> > > 
> > > 
> > Is that reproducible without the fscrypt size handling changes? I
> > haven't seen generic/075 fail on stock kernels.
> 
> Yeah, there is no error about this, since there has no any write between 
> truncateA and truncateB, if there has I am afraid the test will fail in 
> theory.
> 
> 

Interesting. It might be worth trying to craft a test that does that and
see if you can get it to fail on stock kernels. I'd feel better about
this if we had a reliable test for it.

> > If this is a generic bug, then we should go ahead and fix it in
> > mainline. If it's a problem only with fscrypt enabled, then let's plan
> > to roll this patch into those changes.
> > 
> I think it makes sense always to check the truncate_seq and 
> truncate_size here in kclient, or at least should we warn it in case the 
> MDS will do this again in future ?
> 
> The truncate_seq and truncate_size should always be changed at the same 
> time.
> 
> Make sense ?
> 
> 

I think so. I'll plan to do some testing with this today and will plan
to pull it into the testing branch. It would be nice to have a reliable
test for this issue though, so we can ensure it doesn't regress later.

> 
> > > 
> > > > > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> > > > > ---
> > > > >    fs/ceph/inode.c | 5 +++--
> > > > >    1 file changed, 3 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> > > > > index 1b4ce453d397..b4f784684e64 100644
> > > > > --- a/fs/ceph/inode.c
> > > > > +++ b/fs/ceph/inode.c
> > > > > @@ -738,10 +738,11 @@ int ceph_fill_file_size(struct inode *inode, int issued,
> > > > >    			 * don't hold those caps, then we need to check whether
> > > > >    			 * the file is either opened or mmaped
> > > > >    			 */
> > > > > -			if ((issued & (CEPH_CAP_FILE_CACHE|
> > > > > +			if (ci->i_truncate_size != truncate_size &&
> > > > > +			    ((issued & (CEPH_CAP_FILE_CACHE|
> > > > >    				       CEPH_CAP_FILE_BUFFER)) ||
> > > > >    			    mapping_mapped(inode->i_mapping) ||
> > > > > -			    __ceph_is_file_opened(ci)) {
> > > > > +			    __ceph_is_file_opened(ci))) {
> > > > >    				ci->i_truncate_pending++;
> > > > >    				queue_trunc = 1;
> > > > >    			}
> 

-- 
Jeff Layton <jlayton@xxxxxxxxxx>



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux