Dear Xiubo,

I uploaded the cache dump, the MDS log and the dmesg log containing the snaptrace dump to ceph-post-file: 763955a3-7d37-408a-bbe4-a95dc687cd3f

Sorry, I forgot to add user and description this time.

A question about troubleshooting: I'm pretty sure I know the path where the error is located. Would a "ceph tell mds.1 scrub start / recursive repair" be able to discover and fix broken snaptraces? If not, I'm awaiting further instructions.
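For completeness, this is the form I have in mind, sketched below. Whether "mds.1" is the right target in our setup, whether the scrub options need to be comma-separated, and whether a forward scrub can repair snaptraces at all is exactly what I'm unsure about:

$ ceph tell mds.1 scrub start / recursive,repair    # walk the tree from / and repair what the MDS can fix
$ ceph tell mds.1 scrub status                      # check progress of the running scrub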
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Xiubo Li <xiubli@xxxxxxxxxx>
Sent: Friday, May 12, 2023 3:44 PM
To: Frank Schilder; ceph-users@xxxxxxx
Subject: Re: Re: mds dump inode crashes file system

On 5/12/23 20:27, Frank Schilder wrote:
> Dear Xiubo and others.
>
>>> I have never heard about that option until now. How do I check that and how do I disable it if necessary?
>>> I'm in meetings pretty much all day and will try to send some more info later.
>> $ mount|grep ceph
> I get
>
> MON-IPs:SRC on DST type ceph (rw,relatime,name=con-fs2-rit-pfile,secret=<hidden>,noshare,acl,mds_namespace=con-fs2,_netdev)
>
> so async dirop seems disabled.

Yeah.

>> Yeah, the kclient just received a corrupted snaptrace from MDS.
>> So the first thing is you need to fix the corrupted snaptrace issue in cephfs and then continue.
> Ooookaaayyyy. I will take it as a compliment that you seem to assume I know how to do that. The documentation gives 0 hits. Could you please provide me with instructions of what to look for and/or what to do first?

There is no doc about this as far as I know.

>> If possible you can parse the above corrupted snap message to check what exactly is corrupted.
>> I haven't had a chance to do that.
> Again, how would I do that? Is there some documentation, and what should I expect?

Currently there is no easy way to do this that I know of. Last time I parsed the corrupted binary data into the corresponding message manually; that tells us what exactly happened to the snaptrace.

>> It seems you didn't enable the 'osd blocklist' cephx auth cap for mon:
> I can't find anything about an osd blocklist client auth cap in the documentation. Is this something that came after octopus? Our caps are as shown in the documentation for a ceph fs client (https://docs.ceph.com/en/octopus/cephfs/client-auth/), the one for mon is "allow r":
>
> caps mds = "allow rw path=/shares"
> caps mon = "allow r"
> caps osd = "allow rw tag cephfs data=con-fs2"

Yeah, it seems the 'osd blocklist' cap is missing. As I remember, when enabled it should look something like:

caps mon = "allow r, allow command \"osd blocklist\""

>
>> I checked that, but by reading the code I couldn't tell what had caused the MDS crash.
>> Something seems to have corrupted the metadata in cephfs.
> He wrote something about an invalid xattrib (empty value). It would be really helpful to get a clue how to proceed. I managed to dump the MDS cache with the critical inode in cache. Would this help with debugging? I also managed to get debug logs with debug_mds=20 during a crash caused by an "mds dump inode" command. Would this contain something interesting? I can also pull the rados objects out and can upload all of these files.

Yeah, possibly. Where are the logs?

> I managed to track the problem down to a specific folder with a few files (I'm not sure if this coincides with the snaptrace issue, we might have 2 issues here). I made a copy of the folder and checked that an "mds dump inode" for the copy does not crash the MDS. I then moved the folders for which this command causes a crash to a different location outside the mounts. Do you think this will help? I'm wondering whether, after taking our daily snapshot tomorrow, we will end up in the degraded situation again.
>
> I really need instructions for how to check what is broken without an MDS crash and then how to fix it.

First we need to know where the corrupted metadata is. I think the MDS debug logs and the above corrupted snaptrace could help. We need to parse that corrupted binary data.

Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
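For capturing and uploading the MDS debug logs Xiubo refers to, a minimal sketch, assuming the settings are pushed through the cluster config database and reverted afterwards; the log path and the ceph-post-file description flag are from memory and may need checking:

$ ceph config set mds debug_mds 20          # verbose MDS logging
$ ceph config set mds debug_ms 1            # include messenger traffic
# reproduce the problem, e.g. run the "dump inode" command that crashes the MDS
$ ceph config rm mds debug_mds              # revert to defaults
$ ceph config rm mds debug_ms
$ ceph-post-file -d "mds debug log, dump inode crash" /var/log/ceph/ceph-mds.*.log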