On 5/17/23 21:11, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 2:46 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
On 5/17/23 19:29, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 1:04 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
On 5/17/23 18:31, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 7:24 AM <xiubli@xxxxxxxxxx> wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>
When the MClientSnap reqeust's op is not CEPH_SNAP_OP_SPLIT the
request may still contain a list of 'split_realms', and we need
to skip it anyway. Or it will be parsed as a corrupt snaptrace.
Cc: stable@xxxxxxxxxxxxxxx
Cc: Frank Schilder <frans@xxxxxx>
Reported-by: Frank Schilder <frans@xxxxxx>
URL: https://tracker.ceph.com/issues/61200
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
fs/ceph/snap.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index 0e59e95a96d9..d95dfe16b624 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -1114,6 +1114,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
continue;
adjust_snap_realm_parent(mdsc, child, realm->ino);
}
+ } else {
+ p += sizeof(u64) * num_split_inos;
+ p += sizeof(u64) * num_split_realms;
}
/*
--
2.40.1
Hi Xiubo,
This code appears to be very old -- it goes back to the initial commit
963b61eb041e ("ceph: snapshot management") in 2009. Do you have an
explanation for why this popped up only now?
As I remembered we hit this before in one cu BZ last year, but I
couldn't remember exactly which one. But I am not sure whether @Jeff
saw this before I joint ceph team.
Has MDS always been including split_inos and split_realms arrays in
!CEPH_SNAP_OP_SPLIT case or is this a recent change? If it's a recent
change, I'd argue that this needs to be addressed on the MDS side.
While in MDS side for the _UPDATE op it won't send the 'split_realm'
list just before the commit in 2017:
commit 93e7267757508520dfc22cff1ab20558bd4a44d4
Author: Yan, Zheng <zyan@xxxxxxxxxx>
Date: Fri Jul 21 21:40:46 2017 +0800
mds: send snap related messages centrally during mds recovery
sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to
clients centrally in MDCache::open_snaprealms()
Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
Before this commit it will only send the 'split_realm' list for the
_SPLIT op.
It sounds like we have the culprit. This should be treated as
a regression and fixed on the MDS side. I don't see a justification
for putting useless data on the wire.
But we still need this patch to make it to work with the old ceph releases.
This is debatable:
- given that no one noticed this for so long, the likelihood of MDS
sending a CEPH_SNAP_OP_UPDATE message with bogus split_inos and
split_realms arrays is very low
- MDS side fix would be backported to supported Ceph releases
- people who are running unsupported Ceph releases (i.e. aren't
updating) are unlikely to be diligently updating their kernel clients
Just searched the ceph tracker and I found another 3 trackers have the
same issue:
https://tracker.ceph.com/issues/57817
https://tracker.ceph.com/issues/57703
https://tracker.ceph.com/issues/57686
So plusing this time and the previous CU case:
https://www.spinics.net/lists/ceph-users/msg77106.html
I have seen at least 5 times.
All this are reproduced when doing MDS failover, and this is the root
cause in MDS side.
Thanks
- Xiubo
Thanks,
Ilya