On Wed, May 17, 2023 at 3:33 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote: > > > On 5/17/23 21:11, Ilya Dryomov wrote: > > On Wed, May 17, 2023 at 2:46 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote: > >> > >> On 5/17/23 19:29, Ilya Dryomov wrote: > >>> On Wed, May 17, 2023 at 1:04 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote: > >>>> On 5/17/23 18:31, Ilya Dryomov wrote: > >>>>> On Wed, May 17, 2023 at 7:24 AM <xiubli@xxxxxxxxxx> wrote: > >>>>>> From: Xiubo Li <xiubli@xxxxxxxxxx> > >>>>>> > >>>>>> When the MClientSnap reqeust's op is not CEPH_SNAP_OP_SPLIT the > >>>>>> request may still contain a list of 'split_realms', and we need > >>>>>> to skip it anyway. Or it will be parsed as a corrupt snaptrace. > >>>>>> > >>>>>> Cc: stable@xxxxxxxxxxxxxxx > >>>>>> Cc: Frank Schilder <frans@xxxxxx> > >>>>>> Reported-by: Frank Schilder <frans@xxxxxx> > >>>>>> URL: https://tracker.ceph.com/issues/61200 > >>>>>> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx> > >>>>>> --- > >>>>>> fs/ceph/snap.c | 3 +++ > >>>>>> 1 file changed, 3 insertions(+) > >>>>>> > >>>>>> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c > >>>>>> index 0e59e95a96d9..d95dfe16b624 100644 > >>>>>> --- a/fs/ceph/snap.c > >>>>>> +++ b/fs/ceph/snap.c > >>>>>> @@ -1114,6 +1114,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, > >>>>>> continue; > >>>>>> adjust_snap_realm_parent(mdsc, child, realm->ino); > >>>>>> } > >>>>>> + } else { > >>>>>> + p += sizeof(u64) * num_split_inos; > >>>>>> + p += sizeof(u64) * num_split_realms; > >>>>>> } > >>>>>> > >>>>>> /* > >>>>>> -- > >>>>>> 2.40.1 > >>>>>> > >>>>> Hi Xiubo, > >>>>> > >>>>> This code appears to be very old -- it goes back to the initial commit > >>>>> 963b61eb041e ("ceph: snapshot management") in 2009. Do you have an > >>>>> explanation for why this popped up only now? > >>>> As I remembered we hit this before in one cu BZ last year, but I > >>>> couldn't remember exactly which one. But I am not sure whether @Jeff > >>>> saw this before I joint ceph team. > >>>> > >>>> > >>>>> Has MDS always been including split_inos and split_realms arrays in > >>>>> !CEPH_SNAP_OP_SPLIT case or is this a recent change? If it's a recent > >>>>> change, I'd argue that this needs to be addressed on the MDS side. > >>>> While in MDS side for the _UPDATE op it won't send the 'split_realm' > >>>> list just before the commit in 2017: > >>>> > >>>> commit 93e7267757508520dfc22cff1ab20558bd4a44d4 > >>>> Author: Yan, Zheng <zyan@xxxxxxxxxx> > >>>> Date: Fri Jul 21 21:40:46 2017 +0800 > >>>> > >>>> mds: send snap related messages centrally during mds recovery > >>>> > >>>> sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to > >>>> clients centrally in MDCache::open_snaprealms() > >>>> > >>>> Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx> > >>>> > >>>> Before this commit it will only send the 'split_realm' list for the > >>>> _SPLIT op. > >>> It sounds like we have the culprit. This should be treated as > >>> a regression and fixed on the MDS side. I don't see a justification > >>> for putting useless data on the wire. > >> But we still need this patch to make it to work with the old ceph releases. > > This is debatable: > > > > - given that no one noticed this for so long, the likelihood of MDS > > sending a CEPH_SNAP_OP_UPDATE message with bogus split_inos and > > split_realms arrays is very low > > > > - MDS side fix would be backported to supported Ceph releases > > > > - people who are running unsupported Ceph releases (i.e. aren't > > updating) are unlikely to be diligently updating their kernel clients > > Just searched the ceph tracker and I found another 3 trackers have the > same issue: > > https://tracker.ceph.com/issues/57817 > https://tracker.ceph.com/issues/57703 > https://tracker.ceph.com/issues/57686 > > So plusing this time and the previous CU case: > > https://www.spinics.net/lists/ceph-users/msg77106.html > > I have seen at least 5 times. > > All this are reproduced when doing MDS failover, and this is the root > cause in MDS side. OK, given that the fixup in the kernel client is small, it seems justified. But, please, add a comment in the new else branch saying that it's there only to work around a bug in the MDS. Thanks, Ilya