Re: [PATCH] ceph: force updating the msg pointer in non-split case

On 5/17/23 20:53, Venky Shankar wrote:
On Wed, May 17, 2023 at 6:18 PM Venky Shankar <vshankar@xxxxxxxxxx> wrote:
Hey Xiubo,

On Wed, May 17, 2023 at 4:45 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:

On 5/17/23 19:04, Xiubo Li wrote:
On 5/17/23 18:31, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 7:24 AM <xiubli@xxxxxxxxxx> wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

When the MClientSnap request's op is not CEPH_SNAP_OP_SPLIT, the
request may still contain a list of 'split_realms', and we need
to skip it anyway. Otherwise it will be parsed as a corrupt snaptrace.

Cc: stable@xxxxxxxxxxxxxxx
Cc: Frank Schilder <frans@xxxxxx>
Reported-by: Frank Schilder <frans@xxxxxx>
URL: https://tracker.ceph.com/issues/61200
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
   fs/ceph/snap.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index 0e59e95a96d9..d95dfe16b624 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -1114,6 +1114,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
                                  continue;
                          adjust_snap_realm_parent(mdsc, child, realm->ino);
                  }
+       } else {
+               p += sizeof(u64) * num_split_inos;
+               p += sizeof(u64) * num_split_realms;
          }

          /*
--
2.40.1
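
(For context, here is a simplified sketch of how ceph_handle_snap() walks the
payload, with error handling and the realm bookkeeping omitted -- not the exact
kernel code -- showing where the added else branch keeps 'p' pointing at the
snap trace:)

    /* simplified sketch of fs/ceph/snap.c::ceph_handle_snap() */
    void *p = msg->front.iov_base;
    void *e = p + msg->front.iov_len;
    struct ceph_mds_snap_head *h = p;
    u32 op               = le32_to_cpu(h->op);
    u32 num_split_inos   = le32_to_cpu(h->num_split_inos);
    u32 num_split_realms = le32_to_cpu(h->num_split_realms);

    p += sizeof(*h);                       /* past the fixed header */

    if (op == CEPH_SNAP_OP_SPLIT) {
            u64 *split_inos   = p;         /* inos moving to the child realm */
            p += sizeof(u64) * num_split_inos;
            u64 *split_realms = p;         /* child realms moving as well */
            p += sizeof(u64) * num_split_realms;
            /* ... split the realm using split_inos/split_realms ... */
    } else {
            /* the fix: the two arrays can be present for non-split ops too,
             * so skip them or the snap trace below is parsed from the wrong
             * offset */
            p += sizeof(u64) * num_split_inos;
            p += sizeof(u64) * num_split_realms;
    }

    /* 'p' must now point at the snap trace blob */
    ceph_update_snap_trace(mdsc, p, e, op == CEPH_SNAP_OP_DESTROY, NULL);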

Hi Xiubo,

This code appears to be very old -- it goes back to the initial commit
963b61eb041e ("ceph: snapshot management") in 2009.  Do you have an
explanation for why this popped up only now?
As I remember, we hit this before in one customer BZ last year, but I
couldn't recall exactly which one. I am also not sure whether @Jeff
saw this before I joined the ceph team.

@Venky,

Do you remember which one? As I remember, this is why we previously fixed
the snaptrace issue by blocking all the IOs and at the same time
blocklisting the kclient.

Before that the kclient wouldn't dump the corrupted msg, so we didn't know
what was wrong with it; we also added code to dump the msg in the
above fix.
The "corrupted" snaptrace issue happened just after the mds asserted
hitting the metadata corruption (dentry first corrupted) and it
_seemed_ that this corruption somehow triggered a corrupted snaptrace
to be sent to the client.
[sent message a bit early - cotd...]

But I think you found the issue - the message dump did help, and it's
not related to the dentry first corruption.

Yeah, this one is not related to dentry first corruption.

@Ilya

I have created a ceph PR to fix this on the MDS side: https://tracker.ceph.com/issues/61217 and https://github.com/ceph/ceph/pull/51536.

Let's keep this kclient patch to stay compatible with old ceph releases. That way users only need to update the kclient node to fix this issue.

Thanks

- Xiubo


Thanks

- Xiubo

Has the MDS always been including the split_inos and split_realms arrays in
the !CEPH_SNAP_OP_SPLIT case, or is this a recent change?  If it's a recent
change, I'd argue that this needs to be addressed on the MDS side.
On the MDS side, the MDS did not send the 'split_realm' list for the
_UPDATE op until this commit in 2017:

commit 93e7267757508520dfc22cff1ab20558bd4a44d4
Author: Yan, Zheng <zyan@xxxxxxxxxx>
Date:   Fri Jul 21 21:40:46 2017 +0800

     mds: send snap related messages centrally during mds recovery

     sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to
     clients centrally in MDCache::open_snaprealms()

     Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>

Before this commit the MDS would only send the 'split_realm' list for the
_SPLIT op.
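
(For reference, the fixed header the kclient decodes at the start of the
payload is roughly struct ceph_mds_snap_head from include/linux/ceph/ceph_fs.h;
the split_inos/split_realms ino arrays, when present, sit between this header
and the snap trace blob:)

    struct ceph_mds_snap_head {
            __le32 op;                /* CEPH_SNAP_OP_* (0 == UPDATE) */
            __le64 split;             /* ino to split off, if any */
            __le32 num_split_inos;    /* # inos moved to the new child realm */
            __le32 num_split_realms;  /* # child realms under the new child realm */
            __le32 trace_len;         /* size of the snap trace blob */
    } __attribute__ ((packed));
    /* followed by:
     *   num_split_inos   x __le64 ino
     *   num_split_realms x __le64 realm ino
     *   trace_len bytes of snap trace
     */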


The following is the snaptrace:

[Wed May 10 16:03:06 2023] header: 00000000: 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023] header: 00000010: 12 03 7f 00 01 00 00 01 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023] header: 00000020: 00 00 00 00 02 01 00 00 00 00 00 00 00 01 00 00  ................
[Wed May 10 16:03:06 2023] header: 00000030: 00 98 0d 60 93                                   ...`.
[Wed May 10 16:03:06 2023]  front: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................  <<== The op is 0, which is 'CEPH_SNAP_OP_UPDATE'
[Wed May 10 16:03:06 2023]  front: 00000010: 0c 00 00 00 88 00 00 00 d1 c0 71 38 00 01 00 00  ..........q8....  <<== The '0c' is the split_realm number
[Wed May 10 16:03:06 2023]  front: 00000020: 22 c8 71 38 00 01 00 00 d7 c7 71 38 00 01 00 00  ".q8......q8....  <<== All the 'q8' are the ino#
[Wed May 10 16:03:06 2023]  front: 00000030: d9 c7 71 38 00 01 00 00 d4 c7 71 38 00 01 00 00  ..q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000040: f1 c0 71 38 00 01 00 00 d4 c0 71 38 00 01 00 00  ..q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000050: 20 c8 71 38 00 01 00 00 1d c8 71 38 00 01 00 00   .q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000060: ec c0 71 38 00 01 00 00 d6 c0 71 38 00 01 00 00  ..q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000070: ef c0 71 38 00 01 00 00 6a 11 2d 1a 00 01 00 00  ..q8....j.-.....
[Wed May 10 16:03:06 2023]  front: 00000080: 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 00000090: ee 01 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000a0: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000b0: 01 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000c0: 01 00 00 00 00 00 00 00 02 09 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000d0: 05 00 00 00 00 00 00 00 01 09 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000e0: ff 08 00 00 00 00 00 00 fd 08 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000f0: fb 08 00 00 00 00 00 00 f9 08 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023] footer: 00000000: ca 39 06 07 00 00 00 00 00 00 00 00 42 06 63 61  .9..........B.ca
[Wed May 10 16:03:06 2023] footer: 00000010: 7b 4b 5d 2d 05                                   {K]-.


And if the split_realm number happens, by coincidence, to equal
sizeof(ceph_mds_snap_realm) plus the extra snap buffer size, the above
'corrupted' snaptrace will be parsed by the kclient without any warning,
but it will corrupt the snaprealm and capsnap info in the kclient.
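
(Decoding the dump above with that header layout -- my reading, for
illustration only:)

    /* From the 'front' dump:
     *   op               = 0x00000000  ->  CEPH_SNAP_OP_UPDATE
     *   num_split_inos   = 0
     *   num_split_realms = 0x0c        ->  12 realm inos
     *   trace_len        = 0x88        ->  136 bytes
     *
     * The fixed header is 24 (0x18) bytes, so the realm ino list spans
     * front offsets 0x18..0x77 (12 * 8 = 96 bytes) and the real snap
     * trace starts at 0x78; 0x78 + 0x88 = 0x100 matches the front length
     * in the message header.  Without the skip, the kclient starts
     * parsing the trace at offset 0x18 and sees a "corrupt" snaptrace.
     */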


Thanks

- Xiubo


Thanks,

                  Ilya


--
Cheers,
Venky





