Hi,

last weekend we upgraded one of our clusters from 16.2.5 to 16.2.7 using cephadm. The upgrade itself seemed to run without a problem, but shortly afterwards we noticed the servers holding the MDS containers becoming laggy, then unresponsive, then crashing outright due to getting reaped by the OOM killer. The MDS would restart, at some point resulting in a reset of the whole machine. At this point the cluster status started showing a lot of PGs in snaptrim/snaptrim_wait state, so we shut the MDS down and waited for those to finish.

So far nothing we've tried has changed our situation: the MDS tries to hog extreme amounts of memory it never needed before. Even when we push the cache limit down to 1G, or reset it to the default, the daemons report oversized caches and use up 150+GB of RAM+swap until resource exhaustion makes the OOM killer reap them again. Judging by the output of ceph fs status and the logs, the MDS seems to be pushing far too much into its cache; it crashes at around 25-30M inodes and a few thousand dirs. ceph fs status reports the MDS as active, so unlike a few other posts with similar issues the MDS does not seem to be stuck in replay. The journal reports no issues. Additionally, perf reports no strays, nor does listomapvals.

The cluster contains ~33TB in ~220M files, with multiple snapshots of several folders. While ceph fs status reports the MDS as active, I am not able to mount the filesystem.

I will probably recreate/resync this soon, but I'd still like to find out why this happened, since we plan on using Ceph in a few other applications. At the moment I wouldn't feel comfortable using CephFS, given that a single cephadm upgrade command made this amount of data unavailable.

Below are a few lines of log that basically make up the MDS' logs until it runs into "mds.0.purge_queue push: pushing inode" and, shortly after that, the crash.
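For reference, the cache tweaks and stray checks mentioned above were done roughly like this (daemon id and pool name below are placeholders, adjust to your setup; the rank-0 stray dirs live in the metadata pool objects 600.00000000 through 609.00000000):

```shell
# Lower the MDS cache memory limit to 1 GiB...
ceph config set mds mds_cache_memory_limit 1073741824
# ...or drop the override again to go back to the default.
ceph config rm mds mds_cache_memory_limit

# Check the stray counters on the running MDS (daemon id is a placeholder).
ceph tell mds.0 perf dump | grep -i stray

# Inspect a rank-0 stray directory object directly
# (pool name assumed here to be cephfs_metadata).
rados -p cephfs_metadata listomapvals 600.00000000
```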
Any pointers as to what went wrong here, how to debug this further, or how to fix it would be greatly appreciated. I'll probably have the broken system around for another day or two before resetting/recreating the FS.

Thanks,
Izzy Kulbe

MDS log:

20 mds.0.cache.dir(0x604.001111000*) lookup_exact_snap (head, '10000eff917')
10 mds.0.cache.snaprealm(0x10000eff917 seq 3398 0x55c541856200) adjust_parent 0 -> 0x55ba9f1e2e00
12 mds.0.cache.dir(0x604.001111000*) add_primary_dentry [dentry #0x100/stray4/10000eff917 [d89,head] auth (dversion lock) pv=0 v=38664524 ino=0x10000eff917 state=1073741824 0x55c541854a00]
12 mds.0.cache.dir(0x604.001111000*) _fetched got [dentry #0x100/stray4/10000eff917 [d89,head] auth (dversion lock) pv=0 v=38664524 ino=0x10000eff917 state=1073741824 0x55c541854a00] [inode 0x10000eff917 [...d89,head] ~mds0/stray4/10000eff917/ auth v38642807 snaprealm=0x55c541856200 f(v0 m2021-12-29T05:05:25.519523+0000) n(v4 rc2021-12-29T05:05:25.527523+0000 1=0+1) old_inodes=1 (iversion lock) 0x55c541851180]
15 mds.0.cache.ino(0x10000eff917) maybe_ephemeral_rand unlinked directory: cannot ephemeral random pin [inode 0x10000eff917 [...d89,head] ~mds0/stray4/10000eff917/ auth v38642807 snaprealm=0x55c541856200 f(v0 m2021-12-29T05:05:25.519523+0000) n(v4 rc2021-12-29T05:05:25.527523+0000 1=0+1) old_inodes=1 (iversion lock) 0x55c541851180]
20 mds.0.cache.dir(0x604.001111000*) _fetched pos 137 marker 'i' dname '10000efded1 [daf,head]
20 mds.0.cache.dir(0x604.001111000*) lookup (head, '10000efded1')
20 mds.0.cache.dir(0x604.001111000*) miss -> (1001124a935,head)
20 mds.0.cache.ino(0x10000efded1) decode_snap_blob snaprealm(0x10000efded1 seq d9e lc 0 cr d9e cps daf snaps={} past_parent_snaps=be5,c3d,c95,ced,d45,d9d 0x55c541856400)
20 mds.0.cache.dir(0x604.001111000*) lookup_exact_snap (head, '10000efded1')
10 mds.0.cache.snaprealm(0x10000efded1 seq 3486 0x55c541856400) adjust_parent 0 -> 0x55ba9f1e2e00
12 mds.0.cache.dir(0x604.001111000*) add_primary_dentry [dentry #0x100/stray4/10000efded1 [daf,head] auth (dversion lock) pv=0 v=38664524 ino=0x10000efded1 state=1073741824 0x55c541854f00]
12 mds.0.cache.dir(0x604.001111000*) _fetched got [dentry #0x100/stray4/10000efded1 [daf,head] auth (dversion lock) pv=0 v=38664524 ino=0x10000efded1 state=1073741824 0x55c541854f00] [inode 0x10000efded1 [...daf,head] ~mds0/stray4/10000efded1/ auth v38652010 snaprealm=0x55c541856400 f(v0 m2022-01-04T19:09:57.033258+0000) n(v118 rc2022-01-04T19:09:57.041258+0000 1=0+1) old_inodes=6 (iversion lock) 0x55c541851700]
15 mds.0.cache.ino(0x10000efded1) maybe_ephemeral_rand unlinked directory: cannot ephemeral random pin [inode 0x10000efded1 [...daf,head] ~mds0/stray4/10000efded1/ auth v38652010 snaprealm=0x55c541856400 f(v0 m2022-01-04T19:09:57.033258+0000) n(v118 rc2022-01-04T19:09:57.041258+0000 1=0+1) old_inodes=6 (iversion lock) 0x55c541851700]
20 mds.0.cache.dir(0x604.001111000*) _fetched pos 136 marker 'i' dname '10000efad6f [daf,head]
20 mds.0.cache.dir(0x604.001111000*) lookup (head, '10000efad6f')
20 mds.0.cache.dir(0x604.001111000*) miss -> (1000539afa8,head)
20 mds.0.cache.ino(0x10000efad6f) decode_snap_blob snaprealm(0x10000efad6f seq d9e lc 0 cr d9e cps daf snaps={} past_parent_snaps=be5,c3d,c95,ced,d45,d9d 0x55c541856600)
20 mds.0.cache.dir(0x604.001111000*) lookup_exact_snap (head, '10000efad6f')
10 mds.0.cache.snaprealm(0x10000efad6f seq 3486 0x55c541856600) adjust_parent 0 -> 0x55ba9f1e2e00
12 mds.0.cache.dir(0x604.001111000*) add_primary_dentry [dentry #0x100/stray4/10000efad6f [daf,head] auth (dversion lock) pv=0 v=38664524 ino=0x10000efad6f state=1073741824 0x55c541855400]
12 mds.0.cache.dir(0x604.001111000*) _fetched got [dentry #0x100/stray4/10000efad6f [daf,head] auth (dversion lock) pv=0 v=38664524 ino=0x10000efad6f state=1073741824 0x55c541855400] [inode 0x10000efad6f [...daf,head] ~mds0/stray4/10000efad6f/ auth v38652024 snaprealm=0x55c541856600 f() n(v0 rc2022-01-04T19:10:53.334202+0000 1=0+1) old_inodes=6 (iversion lock) 0x55c54185c000]
15 mds.0.cache.ino(0x10000efad6f) maybe_ephemeral_rand unlinked directory: cannot ephemeral random pin [inode 0x10000efad6f [...daf,head] ~mds0/stray4/10000efad6f/ auth v38652024 snaprealm=0x55c541856600 f() n(v0 rc2022-01-04T19:10:53.334202+0000 1=0+1) old_inodes=6 (iversion lock) 0x55c54185c000]
20 mds.0.cache.dir(0x604.001111000*) _fetched pos 135 marker 'i' dname '10000ef98c9 [d89,head]
20 mds.0.cache.dir(0x604.001111000*) lookup (head, '10000ef98c9')
20 mds.0.cache.dir(0x604.001111000*) miss -> (100113dd761,head)
20 mds.0.cache.ino(0x10000ef98c9) decode_snap_blob snaprealm(0x10000ef98c9 seq d46 lc 0 cr d46 cps d89 snaps={} past_parent_snaps=b8b,be5,c3d,c95,ced,d45 0x55c541856800)
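In case anyone wants to reproduce the tally: a quick sketch (Python, log line format assumed from the pacific debug output above, not authoritative) that counts how many stray dentries the MDS reports as fetched per stray directory in a debug log:

```python
import re

# Matches pacific-style debug lines like:
#   12 mds.0.cache.dir(...) _fetched got [dentry #0x100/stray4/10000eff917 ...
STRAY_RE = re.compile(r"_fetched got \[dentry #0x100/(stray\d+)/([0-9a-f]+)")

def count_stray_fetches(lines):
    """Tally fetched stray dentries per stray directory (stray0..stray9)."""
    counts = {}
    for line in lines:
        m = STRAY_RE.search(line)
        if m:
            stray_dir = m.group(1)
            counts[stray_dir] = counts.get(stray_dir, 0) + 1
    return counts

sample = [
    "12 mds.0.cache.dir(0x604.001111000*) _fetched got [dentry #0x100/stray4/10000eff917 [d89,head] auth ...",
    "12 mds.0.cache.dir(0x604.001111000*) _fetched got [dentry #0x100/stray4/10000efded1 [daf,head] auth ...",
]
print(count_stray_fetches(sample))  # -> {'stray4': 2}
```

Running this over the full log (e.g. fed line by line from the journal of the MDS container) would show whether the cache blow-up correlates with stray dirfrag fetches.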