Re: [PATCH] ceph: force updating the msg pointer in non-split case

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hey Xiubo,

On Wed, May 17, 2023 at 4:45 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 5/17/23 19:04, Xiubo Li wrote:
> >
> > On 5/17/23 18:31, Ilya Dryomov wrote:
> >> On Wed, May 17, 2023 at 7:24 AM <xiubli@xxxxxxxxxx> wrote:
> >>> From: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>
> >>> When the MClientSnap reqeust's op is not CEPH_SNAP_OP_SPLIT the
> >>> request may still contain a list of 'split_realms', and we need
> >>> to skip it anyway. Or it will be parsed as a corrupt snaptrace.
> >>>
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Cc: Frank Schilder <frans@xxxxxx>
> >>> Reported-by: Frank Schilder <frans@xxxxxx>
> >>> URL: https://tracker.ceph.com/issues/61200
> >>> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> >>> ---
> >>>   fs/ceph/snap.c | 3 +++
> >>>   1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> >>> index 0e59e95a96d9..d95dfe16b624 100644
> >>> --- a/fs/ceph/snap.c
> >>> +++ b/fs/ceph/snap.c
> >>> @@ -1114,6 +1114,9 @@ void ceph_handle_snap(struct ceph_mds_client
> >>> *mdsc,
> >>>                                  continue;
> >>>                          adjust_snap_realm_parent(mdsc, child,
> >>> realm->ino);
> >>>                  }
> >>> +       } else {
> >>> +               p += sizeof(u64) * num_split_inos;
> >>> +               p += sizeof(u64) * num_split_realms;
> >>>          }
> >>>
> >>>          /*
> >>> --
> >>> 2.40.1
> >>>
> >> Hi Xiubo,
> >>
> >> This code appears to be very old -- it goes back to the initial commit
> >> 963b61eb041e ("ceph: snapshot management") in 2009.  Do you have an
> >> explanation for why this popped up only now?
> >
> > As I remembered we hit this before in one cu BZ last year, but I
> > couldn't remember exactly which one.  But I am not sure whether @Jeff
> > saw this before I joint ceph team.
> >
> @Venky,
>
> Do you remember which one ? As I remembered this is why we fixed the
> snaptrace issue by blocking all the IOs and at the same time
> blocklisting the kclient before.
>
> Before the kcleint won't dump the corrupted msg and we don't know what
> was wrong with the msg and also we added code to dump the msg in the
> above fix.

The "corrupted" snaptrace issue happened just after the mds asserted
hitting the metadata corruption (dentry first corrupted) and it
_seemed_ that this corruption somehow triggered a corrupted snaptrace
to be sent to the client.

>
> Thanks
>
> - Xiubo
>
> >
> >> Has MDS always been including split_inos and split_realms arrays in
> >> !CEPH_SNAP_OP_SPLIT case or is this a recent change?  If it's a recent
> >> change, I'd argue that this needs to be addressed on the MDS side.
> >
> > While in MDS side for the _UPDATE op it won't send the 'split_realm'
> > list just before the commit in 2017:
> >
> > commit 93e7267757508520dfc22cff1ab20558bd4a44d4
> > Author: Yan, Zheng <zyan@xxxxxxxxxx>
> > Date:   Fri Jul 21 21:40:46 2017 +0800
> >
> >     mds: send snap related messages centrally during mds recovery
> >
> >     sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to
> >     clients centrally in MDCache::open_snaprealms()
> >
> >     Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
> >
> > Before this commit it will only send the 'split_realm' list for the
> > _SPLIT op.
> >
> >
> > The following the snaptrace:
> >
> > [Wed May 10 16:03:06 2023] header: 00000000: 05 00 00 00 00 00 00 00
> > 00 00 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023] header: 00000010: 12 03 7f 00 01 00 00 01
> > 00 00 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023] header: 00000020: 00 00 00 00 02 01 00 00
> > 00 00 00 00 00 01 00 00  ................
> > [Wed May 10 16:03:06 2023] header: 00000030: 00 98 0d 60
> > 93                                   ...`.
> > [Wed May 10 16:03:06 2023]  front: 00000000: 00 00 00 00 00 00 00 00
> > 00 00 00 00 00 00 00 00  ................ <<== The op is 0, which is
> > 'CEPH_SNAP_OP_UPDATE'
> > [Wed May 10 16:03:06 2023]  front: 00000010: 0c 00 00 00 88 00 00 00
> > d1 c0 71 38 00 01 00 00  ..........q8....           <<== The '0c' is
> > the split_realm number
> > [Wed May 10 16:03:06 2023]  front: 00000020: 22 c8 71 38 00 01 00 00
> > d7 c7 71 38 00 01 00 00  ".q8......q8....       <<== All the 'q8' are
> > the ino#
> > [Wed May 10 16:03:06 2023]  front: 00000030: d9 c7 71 38 00 01 00 00
> > d4 c7 71 38 00 01 00 00  ..q8......q8....
> > [Wed May 10 16:03:06 2023]  front: 00000040: f1 c0 71 38 00 01 00 00
> > d4 c0 71 38 00 01 00 00  ..q8......q8....
> > [Wed May 10 16:03:06 2023]  front: 00000050: 20 c8 71 38 00 01 00 00
> > 1d c8 71 38 00 01 00 00   .q8......q8....
> > [Wed May 10 16:03:06 2023]  front: 00000060: ec c0 71 38 00 01 00 00
> > d6 c0 71 38 00 01 00 00  ..q8......q8....
> > [Wed May 10 16:03:06 2023]  front: 00000070: ef c0 71 38 00 01 00 00
> > 6a 11 2d 1a 00 01 00 00  ..q8....j.-.....
> > [Wed May 10 16:03:06 2023]  front: 00000080: 01 00 00 00 00 00 00 00
> > 01 00 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 00000090: ee 01 00 00 00 00 00 00
> > 01 00 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 000000a0: 00 00 00 00 00 00 00 00
> > 01 00 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 000000b0: 01 09 00 00 00 00 00 00
> > 00 00 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 000000c0: 01 00 00 00 00 00 00 00
> > 02 09 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 000000d0: 05 00 00 00 00 00 00 00
> > 01 09 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 000000e0: ff 08 00 00 00 00 00 00
> > fd 08 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023]  front: 000000f0: fb 08 00 00 00 00 00 00
> > f9 08 00 00 00 00 00 00  ................
> > [Wed May 10 16:03:06 2023] footer: 00000000: ca 39 06 07 00 00 00 00
> > 00 00 00 00 42 06 63 61  .9..........B.ca
> > [Wed May 10 16:03:06 2023] footer: 00000010: 7b 4b 5d 2d
> > 05                                   {K]-.
> >
> >
> > And if the split_realm number equals to sizeof(ceph_mds_snap_realm) +
> > extra snap buffer size by coincidence, the above 'corrupted' snaptrace
> > will be parsed by kclient too and kclient won't give any warning, but
> > it will corrupted the snaprealm and capsnap info in kclient.
> >
> >
> > Thanks
> >
> > - Xiubo
> >
> >
> >> Thanks,
> >>
> >>                  Ilya
> >>
>


-- 
Cheers,
Venky





[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux