Dear Gregory,

> I would start by looking at what xattrs exist and if there's an obvious bad
> one, deleting it.

I can't see any obvious bad ones, and I also can't just delete them; they are required for ACLs. I'm not convinced that one of the xattrs that can be dumped with 'getfattr -d -m ".*"' is the culprit, they all look fine:

# getfattr -d -m ".*" /mnt/cephfs/shares/rit-oil/Projects/CSP/Chalk/CSP1.A.03/99_Personal\ folders/Eugenio/Tests/Eclipse/19_imbLab/19_IMBLAB.EGRID
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/shares/rit-oil/Projects/CSP/Chalk/CSP1.A.03/99_Personal folders/Eugenio/Tests/Eclipse/19_imbLab/19_IMBLAB.EGRID
security.NTACL=encoded-data-removed
security.selinux="system_u:object_r:cephfs_t:s0"
system.posix_acl_access=encoded-data-removed
user.DOSATTRIB=encoded-data-removed
user.SAMBA_PAI=encoded-data-removed

How can I inspect the file object including all hidden xattrs, for example, all the ceph.* xattrs? There ought to be some rados+decode way of doing that. Would the attribute name be in the OPS list dumped on the MDS crash?

I would be grateful for any pointer you can provide.

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: Wednesday, May 10, 2023 6:18 PM
To: Gregory Farnum
Cc: ceph-users@xxxxxxx
Subject: Re: mds dump inode crashes file system

Hi Gregory,

using the more complicated rados way, I found the path. I assume you are referring to attribs I can read with getfattr. The output of a dump is:

# getfattr -d /mnt/cephfs/shares/rit-oil/Projects/CSP/Chalk/CSP1.A.03/99_Personal\ folders/Eugenio/Tests/Eclipse/19_imbLab/19_IMBLAB.EGRID
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/shares/rit-oil/Projects/CSP/Chalk/CSP1.A.03/99_Personal folders/Eugenio/Tests/Eclipse/19_imbLab/19_IMBLAB.EGRID
user.DOSATTRIB=0sAAAFAAUAAAARAAAAIAAAAIfMCneZfdkB
user.SAMBA_PAI=0sAgSEDwAAAAABgYYeAAAAjLcxAAAC/////wABgYYeAAAAjLcxABAAlFExABABlFExABAALPgoABABLPgoABAADEUvABABDEUvABAAllExABABllExABAAE9AqABABE9AqAA==

#

An empty line is part of the output. These look all right to me. Can you tell me what I should look at?

I will probably reply tomorrow, my time for today is almost up.

Thanks for your help and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Gregory Farnum <gfarnum@xxxxxxxxxx>
Sent: Wednesday, May 10, 2023 4:37 PM
To: Frank Schilder
Cc: ceph-users@xxxxxxx
Subject: Re: Re: mds dump inode crashes file system

On Wed, May 10, 2023 at 7:33 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi Gregory,
>
> thanks for your reply. Yes, I forgot, I can also inspect the rados head object. My bad.
>
> The empty xattr might come from a crash of the SAMBA daemon. We export to Windows and this uses xattrs extensively to map to Windows ACLs. It might be possible that a crash at an inconvenient moment left an object in this state. Do you think this is possible? Would it be possible to repair that?

I'm still a little puzzled that it's possible for the system to get into this state, so we probably will need to generate some bugfixes. And it might just be that the dump function is being naughty.

But I would start by looking at what xattrs exist and if there's an obvious bad one, deleting it.
-Greg

>
> I will report back what I find with the low-level access. Need to head home now ...
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Gregory Farnum <gfarnum@xxxxxxxxxx>
> Sent: Wednesday, May 10, 2023 4:26 PM
> To: Frank Schilder
> Cc: ceph-users@xxxxxxx
> Subject: Re: Re: mds dump inode crashes file system
>
> This is a very strange assert to be hitting. From a code skim, my best guess is the inode somehow has an xattr with no value, but that's just a guess and I've no idea how it would happen.
> Somebody recently pointed you at the (more complicated) way of identifying an inode path by looking at its RADOS object and grabbing the backtrace, which ought to let you look at the file in-situ.
> -Greg
>
>
> On Wed, May 10, 2023 at 6:37 AM Frank Schilder <frans@xxxxxx> wrote:
> >
> > For the "mds dump inode" command I could find the crash in the log; see below. Most of the log content is the past OPS dump from the 3 MDS restarts that happened. It contains the last 10000 OPS before the crash, and I can upload the log if someone can use it. The crash stack trace is somewhat truncated for readability:
> >
> > 2023-05-10T12:54:53.142+0200 7fe971ca6700  1 mds.ceph-23 Updating MDS map to version 892464 from mon.4
> > 2023-05-10T13:39:50.962+0200 7fe96fca2700  0 log_channel(cluster) log [WRN] : client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 61.705410 seconds ago
> > 2023-05-10T13:39:52.550+0200 7fe971ca6700  1 mds.ceph-23 Updating MDS map to version 892465 from mon.4
> > 2023-05-10T13:40:50.963+0200 7fe96fca2700  0 log_channel(cluster) log [WRN] : client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 121.706193 seconds ago
> > 2023-05-10T13:42:50.966+0200 7fe96fca2700  0 log_channel(cluster) log [WRN] : client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 241.709072 seconds ago
> > 2023-05-10T13:44:00.506+0200 7fe972ca8700  1 mds.ceph-23 asok_command: dump inode {number=2199322355147,prefix=dump inode} (starting...)
> > 2023-05-10T13:44:00.520+0200 7fe972ca8700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/common/buffer.cc: In function 'const char* ceph::buffer::v15_2_0::ptr::c_str() const' thread 7fe972ca8700 time 2023-05-10T13:44:00.507652+0200
> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/common/buffer.cc: 501: FAILED ceph_assert(_raw)
> >
> > ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7fe979ae9b92]
> > 2: (()+0x27ddac) [0x7fe979ae9dac]
> > 3: (()+0x5ce831) [0x7fe979e3a831]
> > 4: (InodeStoreBase::dump(ceph::Formatter*) const+0x153) [0x55c08c59b543]
> > 5: (CInode::dump(ceph::Formatter*, int) const+0x144) [0x55c08c59b8d4]
> > 6: (MDCache::dump_inode(ceph::Formatter*, unsigned long)+0x7c) [0x55c08c41e00c]
> > 7: (MDSRank::command_dump_inode(ceph::Formatter*, ..., std::ostream&)+0xb5) [0x55c08c353e75]
> > 8: (MDSRankDispatcher::handle_asok_command(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x2296) [0x55c08c36c5f6]
> > 9: (MDSDaemon::asok_command(std::basic_string_view<char, ..., ceph::buffer::v15_2_0::list&)>)+0x75b) [0x55c08c340eab]
> > 10: (MDSSocketHook::call_async(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x6a) [0x55c08c34f9ca]
> > 11: (AdminSocket::execute_command(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, ..., ceph::buffer::v15_2_0::list&)>)+0x6f9) [0x7fe979bece59]
> > 12: (AdminSocket::do_tell_queue()+0x289) [0x7fe979bed809]
> > 13: (AdminSocket::entry()+0x4d3) [0x7fe979beefd3]
> > 14: (()+0xc2ba3) [0x7fe977afaba3]
> > 15: (()+0x81ca) [0x7fe9786bf1ca]
> > 16: (clone()+0x43) [0x7fe977111dd3]
> >
> > 2023-05-10T13:44:00.522+0200 7fe972ca8700 -1 *** Caught signal (Aborted) **
> > in thread 7fe972ca8700 thread_name:admin_socket
> >
> > ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
> > 1: (()+0x12ce0) [0x7fe9786c9ce0]
> > 2: (gsignal()+0x10f) [0x7fe977126a9f]
> > 3: (abort()+0x127) [0x7fe9770f9e05]
> > 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7fe979ae9be3]
> > 5: (()+0x27ddac) [0x7fe979ae9dac]
> > 6: (()+0x5ce831) [0x7fe979e3a831]
> > 7: (InodeStoreBase::dump(ceph::Formatter*) const+0x153) [0x55c08c59b543]
> > 8: (CInode::dump(ceph::Formatter*, int) const+0x144) [0x55c08c59b8d4]
> > 9: (MDCache::dump_inode(ceph::Formatter*, unsigned long)+0x7c) [0x55c08c41e00c]
> > 10: (MDSRank::command_dump_inode(ceph::Formatter*, ..., std::ostream&)+0xb5) [0x55c08c353e75]
> > 11: (MDSRankDispatcher::handle_asok_command(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x2296) [0x55c08c36c5f6]
> > 12: (MDSDaemon::asok_command(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x75b) [0x55c08c340eab]
> > 13: (MDSSocketHook::call_async(std::basic_string_view<char, std::char_traits<char> >, ..., ceph::buffer::v15_2_0::list&)>)+0x6a) [0x55c08c34f9ca]
> > 14: (AdminSocket::execute_command(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, ..., ceph::buffer::v15_2_0::list&)>)+0x6f9) [0x7fe979bece59]
> > 15: (AdminSocket::do_tell_queue()+0x289) [0x7fe979bed809]
> > 16: (AdminSocket::entry()+0x4d3) [0x7fe979beefd3]
> > 17: (()+0xc2ba3) [0x7fe977afaba3]
> > 18: (()+0x81ca) [0x7fe9786bf1ca]
> > 19: (clone()+0x43) [0x7fe977111dd3]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Frank Schilder <frans@xxxxxx>
> > Sent: Wednesday, May 10, 2023 2:33 PM
> > To: ceph-users@xxxxxxx
> > Subject: mds dump inode crashes file system
> >
> > Hi all,
> >
> > I have an annoying problem with a specific ceph fs client. We have a file server on which we re-export kernel mounts via samba (all mounts with noshare option). On one of these re-exports we have recurring problems. Today I caught it with
> >
> > 2023-05-10T13:39:50.963685+0200 mds.ceph-23 (mds.1) 1761 : cluster [WRN] client.205899841 isn't responding to mclientcaps(revoke), ino 0x20011d3e5cb pending pAsLsXsFscr issued pAsLsXsFscr, sent 61.705410 seconds ago
> >
> > and I wanted to look up what path the inode 0x20011d3e5cb points to. Unfortunately, the command
> >
> > ceph tell "mds.*" dump inode 0x20011d3e5cb
> >
> > crashes an MDS in a way that it restarts itself, but it doesn't seem to come back clean (it does not fail over to a stand-by). If I repeat the command above, it crashes the MDS again. Execution on other MDS daemons succeeds, for example:
> >
> > # ceph tell "mds.ceph-24" dump inode 0x20011d3e5cb
> > 2023-05-10T14:14:37.091+0200 7fa47ffff700  0 client.210149523 ms_handle_reset on v2:192.168.32.88:6800/3216233914
> > 2023-05-10T14:14:37.124+0200 7fa4857fa700  0 client.210374440 ms_handle_reset on v2:192.168.32.88:6800/3216233914
> > dump inode failed, wrong inode number or the inode is not cached
> >
> > The caps recall gets the client evicted at some point, but it doesn't manage to come back clean. On a single ceph fs mount point I see this:
> >
> > # ls /shares/samba/rit-oil
> > ls: cannot access '/shares/samba/rit-oil': Stale file handle
> >
> > All other mount points are fine, just this one acts up. A "mount -o remount /shares/samba/rit-oil" crashed the entire server and I had to do a cold reboot. On reboot I see this message: https://imgur.com/a/bOSLxBb , which only occurs on this one file server (we are running a few of those). Does this point to a more serious problem, like file system corruption? Should I try an fs scrub on the corresponding path?
> >
> > Some info about the system:
> >
> > The file server's kernel version is quite recent, updated two weeks ago:
> >
> > $ uname -r
> > 4.18.0-486.el8.x86_64
> > # cat /etc/redhat-release
> > CentOS Stream release 8
> >
> > Our ceph cluster is on the latest octopus release and we use the packages from the octopus el8 repo on this server.
> >
> > We have several such shares and they all work fine. It is only on one share that we have persistent problems with the mount point hanging or the server freezing and crashing.
> >
> > After working hours I will try a proper fail of the "broken" MDS to see if I can execute the dump inode command without it crashing everything.
> >
> > In the meantime, any hints would be appreciated. I see that we have an exceptionally large MDS log for the problematic one.
> > Any hint what to look for would be appreciated; it contains a lot from the recovery operations:
> >
> > # pdsh -w ceph-[08-17,23-24] ls -lh "/var/log/ceph/ceph-mds.ceph-??.log"
> >
> > ceph-23: -rw-r--r--. 1 ceph ceph 15M May 10 14:28 /var/log/ceph/ceph-mds.ceph-23.log   *** huge ***
> >
> > ceph-24: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-24.log
> > ceph-10: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-10.log
> > ceph-13: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-13.log
> > ceph-08: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-08.log
> > ceph-15: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-15.log
> > ceph-17: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-17.log
> > ceph-14: -rw-r--r--. 1 ceph ceph 16K May 10 14:28 /var/log/ceph/ceph-mds.ceph-14.log
> > ceph-09: -rw-r--r--. 1 ceph ceph 16K May 10 14:28 /var/log/ceph/ceph-mds.ceph-09.log
> > ceph-16: -rw-r--r--. 1 ceph ceph 15K May 10 14:28 /var/log/ceph/ceph-mds.ceph-16.log
> > ceph-11: -rw-r--r--. 1 ceph ceph 14K May 10 14:28 /var/log/ceph/ceph-mds.ceph-11.log
> > ceph-12: -rw-r--r--. 1 ceph ceph 394 May 10 14:02 /var/log/ceph/ceph-mds.ceph-12.log
> >
> > Thanks and best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
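
For reference, a minimal sketch of the rados+decode inspection discussed in the thread above, under stated assumptions: the pool name cephfs_data is a hypothetical placeholder for the file system's data pool, and the object name is built from the inode 0x20011d3e5cb mentioned in the logs (the first data object of a file is named <inode-hex>.00000000). This only shows the xattrs stored on the RADOS data object itself and decodes its "parent" backtrace to recover the path; the per-file xattr map that "mds dump inode" trips over is kept with the inode metadata in the metadata pool, not on the data object.

# list the xattrs present on the first data object of the inode
rados -p cephfs_data listxattr 20011d3e5cb.00000000

# fetch the backtrace xattr and decode it; the dumped ancestor dentries give the path
rados -p cephfs_data getxattr 20011d3e5cb.00000000 parent > /tmp/parent_xattr
ceph-dencoder type inode_backtrace_t import /tmp/parent_xattr decode dump_json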