Re: [PATCH v3] ceph: defer flushing the capsnap if the Fb is used

On 2021/1/21 22:28, Jeff Layton wrote:
On Mon, 2021-01-18 at 17:10 +0800, Xiubo Li wrote:
On 2021/1/13 5:48, Jeff Layton wrote:
On Sun, 2021-01-10 at 10:01 +0800, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

If the Fb cap is in use, it means the current inode is still flushing
dirty data to the OSDs, so just defer flushing the capsnap.

URL: https://tracker.ceph.com/issues/48679
URL: https://tracker.ceph.com/issues/48640
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---

V3:
- Add more comments about putting the inode ref
- A small change about the code style

V2:
- Fix inode reference leak bug

   fs/ceph/caps.c | 32 +++++++++++++++++++-------------
   fs/ceph/snap.c |  6 +++---
   2 files changed, 22 insertions(+), 16 deletions(-)
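
For context, a minimal user-space sketch of the deferral rule the patch
summary describes: while the client still holds an Fb (buffer) cap
reference, writeback of dirty data is in flight, so the capsnap flush is
postponed. The bit values mirror the kernel's ceph_fs.h definitions, but
capsnap_ready_to_flush() and its used_caps argument are hypothetical
stand-ins here, not the actual fs/ceph/caps.c or fs/ceph/snap.c code.

#include <stdbool.h>
#include <stdio.h>

#define CEPH_CAP_FILE_WR      (1 << 12)   /* Fw: client may write */
#define CEPH_CAP_FILE_BUFFER  (1 << 13)   /* Fb: client may buffer writes */

/* Return true when the capsnap can be flushed now, false to defer it. */
static bool capsnap_ready_to_flush(unsigned int used_caps)
{
	/* Fb still held: dirty data is still being written back, defer. */
	if (used_caps & CEPH_CAP_FILE_BUFFER)
		return false;
	return true;
}

int main(void)
{
	printf("Fb used  -> flush now? %d\n",
	       capsnap_ready_to_flush(CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_WR));
	printf("Fb clear -> flush now? %d\n",
	       capsnap_ready_to_flush(CEPH_CAP_FILE_WR));
	return 0;
}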

Hi Xiubo,

This patch seems to cause hangs in some xfstests (generic/013, in
particular). I'll take a closer look when I have a chance, but I'm
dropping this for now.
Okay.

BTW, what are your test commands to reproduce it? I will take a look when
I get some free time in the next few days.


FWIW, I was able to trigger a hang with this patch by running one of the
tests that this patch was intended to fix (snaptest-git-ceph.sh). Here's
the stack trace of the hung task:

# cat /proc/1166/stack
[<0>] wait_woken+0x87/0xb0
[<0>] ceph_get_caps+0x405/0x6a0 [ceph]
[<0>] ceph_write_iter+0x2ca/0xd20 [ceph]
[<0>] new_sync_write+0x10b/0x190
[<0>] vfs_write+0x240/0x390
[<0>] ksys_write+0x58/0xd0
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Hi Jeff,

I have reproduced it, and I also tried libcephfs, which has the same logic for this issue; it worked well there.

I will take a look at it later.

Without this patch I could run that test in a loop without issue. This
bug mentions that the original issue occurred during mds thrashing
though, and I haven't tried reproducing that scenario yet:

     https://tracker.ceph.com/issues/48640

From the logs, this issue does not seem to be related to the thrashing operation; during this test the MDS had already been successfully thrashed.

BRs

Xiubo

Cheers,




