Re: Not able to start MDS after upgrade to 16.2.7

Hi,

Is the memory ballooning while the MDS is active or could it be while
it is rejoining the cluster?
If the latter, this could be another case of:
https://tracker.ceph.com/issues/54253
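
To check, you could watch the rank's state and the daemon's own view of its
cache while the memory climbs. A rough sketch (mds.<name> is a placeholder,
and the mds_oft_prefetch_dirfrags workaround is from the tracker discussion
as I remember it, so please verify against the ticket before setting it):

ceph fs status                        # is the rank stuck in up:rejoin while RSS grows?
ceph health detail
ceph daemon mds.<name> cache status   # on the MDS host (via cephadm shell)
ceph daemon mds.<name> dump_mempools

# possible mitigation discussed in the tracker, then fail the rank so a
# standby takes over with the new setting:
ceph config set mds mds_oft_prefetch_dirfrags false
ceph mds fail <fs_name>:0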

Cheers, Dan


On Wed, Feb 9, 2022 at 7:23 PM Izzy Kulbe <ceph@xxxxxxx> wrote:
>
> Hi,
>
> last weekend we upgraded one of our clusters from 16.2.5 to 16.2.7 using
> cephadm.
>
> The upgrade itself seemed to run without a problem, but shortly after the
> upgrade we noticed the servers holding the MDS containers being laggy, then
> unresponsive, then crashing outright due to getting reaped by the OOM
> killer. The MDS would restart, at some point resulting in a reset of the
> whole machine.
>
> At this point the cluster status started showing a lot of PGs with
> snaptrim/snaptrim_wait status, so we waited for those to finish with the
> MDS shut down.
>
> So far everything we've tried has changed nothing about our situation - our
> MDS daemons try to hog extreme amounts of memory they never needed before.
> Even when we push the cache limit down to 1G or reset it to the default,
> they report as having oversized caches, using up 150+GB of RAM+swap until
> resource exhaustion makes the OOM killer reap them again.
>
> Looking at the output of ceph fs status and the logs, it seems the MDS is
> trying to push too much into its cache. The MDS crashes at around 25-30M
> inodes and a few thousand dirs. ceph fs status reports the MDS as active;
> unlike a few other posts with similar issues, the MDS does not seem to be
> stuck in replay. The journal reports no issues. Additionally, perf reports
> no strays, nor does listomapvals.
>
> The cluster contains ~33TB in ~220M files. There are multiple snapshots for
> several folders. While ceph fs status reports the MDS as active, I am not
> able to mount the filesystem.
>
> I will probably recreate/resync this soon, but I'd still like to find out
> why this happened since we plan on using Ceph in a few other applications.
> At the moment I wouldn't feel comfortable using CephFS, given that a single
> cephadm upgrade command made this amount of data unavailable.
>
> Below are a few lines of log that basically make up the MDS's log until it
> runs into "mds.0.purge_queue push: pushing inode", shortly after which it
> crashes.
>
> Any pointers to what went wrong here, how to debug this further or how to
> fix this would be greatly appreciated. I'll probably have the broken system
> for another day or two before resetting/recreating the FS.
>
> Thanks,
> Izzy Kulbe
>
>
> MDS Log:
>
> 20 mds.0.cache.dir(0x604.001111000*) lookup_exact_snap (head, '10000eff917')
> 10  mds.0.cache.snaprealm(0x10000eff917 seq 3398 0x55c541856200)
> adjust_parent 0 -> 0x55ba9f1e2e00
> 12 mds.0.cache.dir(0x604.001111000*) add_primary_dentry [dentry
> #0x100/stray4/10000eff917 [d89,head] auth (dversion lock) pv=0 v=38664524
> ino=0x10000eff917 state=1073741824 0x55c541854a00]
> 12 mds.0.cache.dir(0x604.001111000*) _fetched  got [dentry
> #0x100/stray4/10000eff917 [d89,head] auth (dversion lock) pv=0 v=38664524
> ino=0x10000eff917 state=1073741824 0x55c541854a00] [inode 0x10000eff917
> [...d89,head] ~mds0/stray4/10000eff917/ auth v38642807
> snaprealm=0x55c541856200 f(v0 m2021-12-29T05:05:25.519523+0000) n(v4
> rc2021-12-29T05:05:25.527523+0000 1=0+1) old_inodes=1 (iversion lock)
> 0x55c541851180]
> 15 mds.0.cache.ino(0x10000eff917) maybe_ephemeral_rand unlinked directory:
> cannot ephemeral random pin [inode 0x10000eff917 [...d89,head]
> ~mds0/stray4/10000eff917/ auth v38642807 snaprealm=0x55c541856200 f(v0
> m2021-12-29T05:05:25.519523+0000) n(v4 rc2021-12-29T05:05:25.527523+0000
> 1=0+1) old_inodes=1 (iversion lock) 0x55c541851180]
> 20 mds.0.cache.dir(0x604.001111000*) _fetched pos 137 marker 'i' dname
> '10000efded1 [daf,head]
> 20 mds.0.cache.dir(0x604.001111000*) lookup (head, '10000efded1')
> 20 mds.0.cache.dir(0x604.001111000*)   miss -> (1001124a935,head)
> 20 mds.0.cache.ino(0x10000efded1) decode_snap_blob snaprealm(0x10000efded1
> seq d9e lc 0 cr d9e cps daf snaps={}
> past_parent_snaps=be5,c3d,c95,ced,d45,d9d 0x55c541856400)
> 20 mds.0.cache.dir(0x604.001111000*) lookup_exact_snap (head, '10000efded1')
> 10  mds.0.cache.snaprealm(0x10000efded1 seq 3486 0x55c541856400)
> adjust_parent 0 -> 0x55ba9f1e2e00
> 12 mds.0.cache.dir(0x604.001111000*) add_primary_dentry [dentry
> #0x100/stray4/10000efded1 [daf,head] auth (dversion lock) pv=0 v=38664524
> ino=0x10000efded1 state=1073741824 0x55c541854f00]
> 12 mds.0.cache.dir(0x604.001111000*) _fetched  got [dentry
> #0x100/stray4/10000efded1 [daf,head] auth (dversion lock) pv=0 v=38664524
> ino=0x10000efded1 state=1073741824 0x55c541854f00] [inode 0x10000efded1
> [...daf,head] ~mds0/stray4/10000efded1/ auth v38652010
> snaprealm=0x55c541856400 f(v0 m2022-01-04T19:09:57.033258+0000) n(v118
> rc2022-01-04T19:09:57.041258+0000 1=0+1) old_inodes=6 (iversion lock)
> 0x55c541851700]
> 15 mds.0.cache.ino(0x10000efded1) maybe_ephemeral_rand unlinked directory:
> cannot ephemeral random pin [inode 0x10000efded1 [...daf,head]
> ~mds0/stray4/10000efded1/ auth v38652010 snaprealm=0x55c541856400 f(v0
> m2022-01-04T19:09:57.033258+0000) n(v118 rc2022-01-04T19:09:57.041258+0000
> 1=0+1) old_inodes=6 (iversion lock) 0x55c541851700]
> 20 mds.0.cache.dir(0x604.001111000*) _fetched pos 136 marker 'i' dname
> '10000efad6f [daf,head]
> 20 mds.0.cache.dir(0x604.001111000*) lookup (head, '10000efad6f')
> 20 mds.0.cache.dir(0x604.001111000*)   miss -> (1000539afa8,head)
> 20 mds.0.cache.ino(0x10000efad6f) decode_snap_blob snaprealm(0x10000efad6f
> seq d9e lc 0 cr d9e cps daf snaps={}
> past_parent_snaps=be5,c3d,c95,ced,d45,d9d 0x55c541856600)
> 20 mds.0.cache.dir(0x604.001111000*) lookup_exact_snap (head, '10000efad6f')
> 10  mds.0.cache.snaprealm(0x10000efad6f seq 3486 0x55c541856600)
> adjust_parent 0 -> 0x55ba9f1e2e00
> 12 mds.0.cache.dir(0x604.001111000*) add_primary_dentry [dentry
> #0x100/stray4/10000efad6f [daf,head] auth (dversion lock) pv=0 v=38664524
> ino=0x10000efad6f state=1073741824 0x55c541855400]
> 12 mds.0.cache.dir(0x604.001111000*) _fetched  got [dentry
> #0x100/stray4/10000efad6f [daf,head] auth (dversion lock) pv=0 v=38664524
> ino=0x10000efad6f state=1073741824 0x55c541855400] [inode 0x10000efad6f
> [...daf,head] ~mds0/stray4/10000efad6f/ auth v38652024
> snaprealm=0x55c541856600 f() n(v0 rc2022-01-04T19:10:53.334202+0000 1=0+1)
> old_inodes=6 (iversion lock) 0x55c54185c000]
> 15 mds.0.cache.ino(0x10000efad6f) maybe_ephemeral_rand unlinked directory:
> cannot ephemeral random pin [inode 0x10000efad6f [...daf,head]
> ~mds0/stray4/10000efad6f/ auth v38652024 snaprealm=0x55c541856600 f() n(v0
> rc2022-01-04T19:10:53.334202+0000 1=0+1) old_inodes=6 (iversion lock)
> 0x55c54185c000]
> 20 mds.0.cache.dir(0x604.001111000*) _fetched pos 135 marker 'i' dname
> '10000ef98c9 [d89,head]
> 20 mds.0.cache.dir(0x604.001111000*) lookup (head, '10000ef98c9')
> 20 mds.0.cache.dir(0x604.001111000*)   miss -> (100113dd761,head)
> 20 mds.0.cache.ino(0x10000ef98c9) decode_snap_blob snaprealm(0x10000ef98c9
> seq d46 lc 0 cr d46 cps d89 snaps={}
> past_parent_snaps=b8b,be5,c3d,c95,ced,d45 0x55c541856800)
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


