Hi Henning,

On Wed, May 17, 2023 at 9:25 PM Henning Achterrath <achhen@xxxxxxxxxxx> wrote:
>
> Hi all,
>
> we did a major update from Pacific to Quincy (17.2.5) a month ago
> without any problems.
>
> Now we have tried a minor update from 17.2.5 to 17.2.6 (ceph orch
> upgrade). It stucks at mds upgrade phase. At this point the cluster
> tries to scale down mds (ceph fs set max_mds 1). We waited a few hours.
>
> We are running two active mds with 1 standby. No subdir pinning
> configured. CephFS data pool: 575 TB
>
> While Upgrading, Rank 1 MDS remains in state stopping. During this state
> clients are not able to reconnect. So we paused this upgrade and set
> max_mds to 2 back again and fail rank 1. After that, standby becomes active.
>
> In the mds (rank 1 in stopping state) logs we can see: waiting for
> strays to migrate

mds.1 on shutdown will export its strays to mds.0 - this is expected.

>
> In our second try, we have evicted all clients first without success.
>
> We make daily snapshots on / and rotate them via snapshot scheduler
> after one week.
>
> Is there a way to get rid of stray entries without scale down mds or do
> we have to wait longer?

Do you see the perf counters related to strays (esp. strays_migrated)
increasing for mds.1? If those are not changing, then the stray export
has probably hung - which could be due to a bug. If you see this,
could you send back the logs for both ranks? (assuming those are debug logs).

>
> We had about the same amount of strays before we did the major upgrade.
> So, it is a bit curious.
>
> Current output from ceph perf dump
>
> Rank0:
>
> "num_strays": 417304,
> "num_strays_delayed": 3,
> "num_strays_enqueuing": 0,
> "strays_created": 567879,
> "strays_enqueued": 561803,
> "strays_reintegrated": 13751,
> "strays_migrated": 4,
>
>
> Rank1:
>
> ceph daemon mds.fdi-cephfs.ceph-service-13.rwdkqs perf dump | grep stray
>
> "num_strays": 172528,
> "num_strays_delayed": 0,
> "num_strays_enqueuing": 0,
> "strays_created": 418365,
> "strays_enqueued": 396142,
> "strays_reintegrated": 67406,
> "strays_migrated": 4,
>
>
>
> Any help would be appreciated.
>
>
> best regards
> Henning
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Cheers,
Venky
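
P.S. To check whether the stray counters asked about above are actually moving, one option is to take two samples of the perf dump some time apart and compare them. The following is only a minimal Python sketch of that idea, not an official tool: the helper names (stray_counters, stray_deltas, export_looks_hung, sample) are made up for illustration, and it assumes only that `ceph daemon <name> perf dump` prints JSON whose top-level sections hold the counters. It scans every section for counters containing "stray" rather than relying on a particular section name.

```python
import json
import subprocess


def stray_counters(perf_dump):
    """Collect every numeric counter whose name contains 'stray'.

    The section layout is not assumed; all top-level sections are scanned.
    """
    found = {}
    for section in perf_dump.values():
        if not isinstance(section, dict):
            continue
        for name, value in section.items():
            if "stray" in name and isinstance(value, (int, float)):
                found[name] = value
    return found


def stray_deltas(before, after):
    """Return after - before for each counter present in both samples."""
    return {name: after[name] - before[name] for name in before if name in after}


def export_looks_hung(before, after, counter="strays_migrated"):
    """True if the counter did not increase between the two samples.

    A counter missing from either sample is treated as not increasing.
    """
    return stray_deltas(before, after).get(counter, 0) <= 0


def sample(daemon_name):
    """Run `ceph daemon <name> perf dump` on the MDS host and parse the JSON."""
    out = subprocess.run(
        ["ceph", "daemon", daemon_name, "perf", "dump"],
        check=True, capture_output=True, text=True,
    ).stdout
    return stray_counters(json.loads(out))
```

Usage would be to call sample() with the rank 1 daemon name (the one given to `ceph daemon` in the quoted output) twice, a few minutes apart, and pass both results to export_looks_hung(). If it returns True while the rank is still in the stopping state, that matches the "probably hung" case described above, and the debug logs for both ranks would be the next thing to collect.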