Hi all,
We have a Ceph Nautilus cluster (14.2.8) with two CephFS filesystems
and 3 MDS daemons (1 active for each FS + one standby).
We are transferring all the data (~600M files) from one FS (which was
in EC 3+2) to the other FS (in R3).
On the old FS we first removed the snapshots (to avoid stray problems
when removing files) and then ran some rsync jobs, deleting the files
after the transfer (along the lines of the sketch below).
The operation should take a few more weeks to complete.
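For reference, a minimal sketch of the kind of rsync invocation in
use (the paths are placeholders; --remove-source-files deletes each
file from the old FS once it has been copied):

    rsync -aHAX --remove-source-files /mnt/oldfs/some/dir/ /mnt/newfs/some/dir/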
But a few days ago, we started to get "MDS behind on trimming"
warnings from the MDS managing the old FS.
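We check it with something like this (mds.oldfs is a placeholder for
the daemon name; the "seg" counter in the mds_log section gives the
current number of journal segments):

    ceph health detail
    ceph daemon mds.oldfs perf dump mds_log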
Yesterday, I restarted the active MDS service to force a takeover by
the standby MDS (basically because the standby is more powerful and
has more memory, i.e. 48 GB instead of 32).
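Concretely, I did something like this on the node holding the active
rank (the daemon name is a placeholder; "ceph mds fail" on the rank
would have been an alternative way to hand over to the standby):

    systemctl restart ceph-mds@mds01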
The standby MDS took rank 0 and started to replay... the "behind on
trimming" warning came back and the number of segments rose, as did
the memory usage of the server. Finally, it exhausted the memory of
the MDS node, the service stopped, and the previous MDS took rank 0
and started to replay... until memory exhaustion and a new switch of
MDS, etc.
It thus seems that we are in a never-ending loop! And of course, as
the MDS is always in replay, the data are not accessible and the
transfers are blocked.
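One way to see whether the replay is at least making progress (I am
assuming the mds_log rdpos/wrpos counters here; the daemon name is a
placeholder) is to watch the journal read position advance towards
the write position:

    watch 'ceph daemon mds.mds01 perf dump mds_log | grep -E "rdpos|expos|wrpos"'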
I stopped all the rsync jobs and unmounted the clients.
My questions are:
- Does the MDS trim during the replay, so we can hope that after a
while it will purge everything and the MDS will be able to become
active in the end?
- Is there a way to speed up the operation or to fix this situation?
Thanks for your help.
F.