Re: [PATCH] ceph: force updating the msg pointer in non-split case

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 17, 2023 at 3:33 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 5/17/23 21:11, Ilya Dryomov wrote:
> > On Wed, May 17, 2023 at 2:46 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >>
> >> On 5/17/23 19:29, Ilya Dryomov wrote:
> >>> On Wed, May 17, 2023 at 1:04 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >>>> On 5/17/23 18:31, Ilya Dryomov wrote:
> >>>>> On Wed, May 17, 2023 at 7:24 AM <xiubli@xxxxxxxxxx> wrote:
> >>>>>> From: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>>>>
> >>>>>> When the MClientSnap reqeust's op is not CEPH_SNAP_OP_SPLIT the
> >>>>>> request may still contain a list of 'split_realms', and we need
> >>>>>> to skip it anyway. Or it will be parsed as a corrupt snaptrace.
> >>>>>>
> >>>>>> Cc: stable@xxxxxxxxxxxxxxx
> >>>>>> Cc: Frank Schilder <frans@xxxxxx>
> >>>>>> Reported-by: Frank Schilder <frans@xxxxxx>
> >>>>>> URL: https://tracker.ceph.com/issues/61200
> >>>>>> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> >>>>>> ---
> >>>>>>     fs/ceph/snap.c | 3 +++
> >>>>>>     1 file changed, 3 insertions(+)
> >>>>>>
> >>>>>> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> >>>>>> index 0e59e95a96d9..d95dfe16b624 100644
> >>>>>> --- a/fs/ceph/snap.c
> >>>>>> +++ b/fs/ceph/snap.c
> >>>>>> @@ -1114,6 +1114,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
> >>>>>>                                    continue;
> >>>>>>                            adjust_snap_realm_parent(mdsc, child, realm->ino);
> >>>>>>                    }
> >>>>>> +       } else {
> >>>>>> +               p += sizeof(u64) * num_split_inos;
> >>>>>> +               p += sizeof(u64) * num_split_realms;
> >>>>>>            }
> >>>>>>
> >>>>>>            /*
> >>>>>> --
> >>>>>> 2.40.1
> >>>>>>
> >>>>> Hi Xiubo,
> >>>>>
> >>>>> This code appears to be very old -- it goes back to the initial commit
> >>>>> 963b61eb041e ("ceph: snapshot management") in 2009.  Do you have an
> >>>>> explanation for why this popped up only now?
> >>>> As I remembered we hit this before in one cu BZ last year, but I
> >>>> couldn't remember exactly which one.  But I am not sure whether @Jeff
> >>>> saw this before I joint ceph team.
> >>>>
> >>>>
> >>>>> Has MDS always been including split_inos and split_realms arrays in
> >>>>> !CEPH_SNAP_OP_SPLIT case or is this a recent change?  If it's a recent
> >>>>> change, I'd argue that this needs to be addressed on the MDS side.
> >>>> While in MDS side for the _UPDATE op it won't send the 'split_realm'
> >>>> list just before the commit in 2017:
> >>>>
> >>>> commit 93e7267757508520dfc22cff1ab20558bd4a44d4
> >>>> Author: Yan, Zheng <zyan@xxxxxxxxxx>
> >>>> Date:   Fri Jul 21 21:40:46 2017 +0800
> >>>>
> >>>>        mds: send snap related messages centrally during mds recovery
> >>>>
> >>>>        sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to
> >>>>        clients centrally in MDCache::open_snaprealms()
> >>>>
> >>>>        Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
> >>>>
> >>>> Before this commit it will only send the 'split_realm' list for the
> >>>> _SPLIT op.
> >>> It sounds like we have the culprit.  This should be treated as
> >>> a regression and fixed on the MDS side.  I don't see a justification
> >>> for putting useless data on the wire.
> >> But we still need this patch to make it to work with the old ceph releases.
> > This is debatable:
> >
> > - given that no one noticed this for so long, the likelihood of MDS
> >    sending a CEPH_SNAP_OP_UPDATE message with bogus split_inos and
> >    split_realms arrays is very low
> >
> > - MDS side fix would be backported to supported Ceph releases
> >
> > - people who are running unsupported Ceph releases (i.e. aren't
> >    updating) are unlikely to be diligently updating their kernel clients
>
> Just searched the ceph tracker and I found another 3 trackers have the
> same issue:
>
> https://tracker.ceph.com/issues/57817
> https://tracker.ceph.com/issues/57703
> https://tracker.ceph.com/issues/57686
>
> So plusing this time and the previous CU case:
>
> https://www.spinics.net/lists/ceph-users/msg77106.html
>
> I have seen at least 5 times.
>
> All this are reproduced when doing MDS failover, and this is the root
> cause in MDS side.

OK, given that the fixup in the kernel client is small, it seems
justified.  But, please, add a comment in the new else branch saying
that it's there only to work around a bug in the MDS.

Thanks,

                Ilya




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux