Re: [PATCH] ceph: do not truncate pagecache if truncate size doesn't change

On 11/18/21 12:46 PM, Xiubo Li wrote:

On 11/18/21 5:10 AM, Jeff Layton wrote:
On Tue, 2021-11-16 at 17:20 +0800, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

When truncating a file to a smaller sizeA, sizeA is kept in
truncate_size. If the file is then truncated to a bigger sizeB, the
MDS only increases the truncate_seq, but keeps sizeA as the
truncate_size.

So when filling the inode we would truncate the pagecache using
sizeA again, which makes no sense and trims innocent pages.

Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
  fs/ceph/inode.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 1b4ce453d397..b4f784684e64 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -738,10 +738,11 @@ int ceph_fill_file_size(struct inode *inode, int issued,
               * don't hold those caps, then we need to check whether
               * the file is either opened or mmaped
               */
-            if ((issued & (CEPH_CAP_FILE_CACHE|
+            if (ci->i_truncate_size != truncate_size &&
+                ((issued & (CEPH_CAP_FILE_CACHE|
                         CEPH_CAP_FILE_BUFFER)) ||
                  mapping_mapped(inode->i_mapping) ||
-                __ceph_is_file_opened(ci)) {
+                __ceph_is_file_opened(ci))) {
                  ci->i_truncate_pending++;
                  queue_trunc = 1;
              }
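The truncate_size bookkeeping described in the commit message can be sketched with a small, purely illustrative (non-kernel) Python model; the class and method names here are hypothetical, not Ceph APIs:

```python
# Illustrative sketch, NOT kernel code: shrinking a file records the new
# truncate_size, while growing it only bumps truncate_seq and keeps the
# old (smaller) truncate_size, per the commit message above.

class InodeState:
    def __init__(self, size):
        self.size = size
        self.truncate_seq = 0
        self.truncate_size = size

    def truncate(self, new_size):
        self.truncate_seq += 1
        if new_size < self.size:
            # Shrink: clients must drop pagecache beyond new_size.
            self.truncate_size = new_size
        # Grow: truncate_size is left at the previous (smaller) value.
        self.size = new_size

ino = InodeState(size=100)
ino.truncate(40)    # shrink: truncate_size becomes 40
ino.truncate(200)   # grow: truncate_seq bumps, truncate_size stays 40
print(ino.truncate_size)  # still 40; blindly re-truncating the
                          # pagecache to 40 here would trim innocent pages
```

This is why the patch adds the `ci->i_truncate_size != truncate_size` guard: if the cached truncate_size is unchanged, there is nothing new to trim.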

This patch causes xfstest generic/129 to hang at umount time when
applied on top of the testing branch and run (without fscrypt
enabled). The call stack looks like this:

         [<0>] wb_wait_for_completion+0xc3/0x120
         [<0>] __writeback_inodes_sb_nr+0x151/0x190
         [<0>] sync_filesystem+0x59/0x100
         [<0>] generic_shutdown_super+0x44/0x1d0
         [<0>] kill_anon_super+0x1e/0x40
         [<0>] ceph_kill_sb+0x5f/0xc0 [ceph]
         [<0>] deactivate_locked_super+0x5d/0xd0
         [<0>] cleanup_mnt+0x1f4/0x260
         [<0>] task_work_run+0x8b/0xc0
         [<0>] exit_to_user_mode_prepare+0x267/0x270
         [<0>] syscall_exit_to_user_mode+0x16/0x50
         [<0>] do_syscall_64+0x48/0x90
         [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae

I suspect this is causing dirty data to get stuck in the cache somehow,
but I haven't tracked down the cause in detail.

BTW, could you reproduce this every time?

I have tried this on the "ceph-client/wip-fscrypt-size" branch, both with "test_dummy_encryption" enabled and disabled, many times, and it all worked well for me.

I also tested this patch on the "testing" branch without fscrypt enabled many times; that also worked well for me:

[root@lxbceph1 xfstests]# date; ./check generic/129; date
Thu Nov 18 12:22:25 CST 2021
FSTYP         -- ceph
PLATFORM      -- Linux/x86_64 lxbceph1 5.15.0+
MKFS_OPTIONS  -- 10.72.7.17:40543:/testB
MOUNT_OPTIONS -- -o name=admin,secret=AQDS3IFhEtxvORAAxn1d4FVN2bRUsc/TZMpQvQ== -o context=system_u:object_r:root_t:s0 10.72.47.117:40543:/testB /mnt/kcephfs/testD

generic/129 648s ... 603s
Ran: generic/129
Passed all 1 tests

Thu Nov 18 12:32:33 CST 2021


I have run this for several hours, and so far no hang has occurred locally:

  $ while [ 1 ]; do date; ./check generic/129; date; done

Is it possible that you were still using the old binaries you built?


Thanks

BRs

-- Xiubo
