MDS flapping: how to increase MDS timeouts?

Hi,


We are running two MDS servers in an active/standby-replay setup. Recently we had to disconnect the active MDS server, and the failover to the standby was triggered as expected.


The filesystem currently contains over 5 million files, so reading all the metadata from the data pool took too long, because the information was not in the OSD page caches. The mons timed out the MDS and failed over to the former active MDS (which was available as a standby again). This MDS in turn had to read the metadata, ran into the same timeout, triggered another failover, and so on. I resolved the situation by disabling one of the MDS daemons, which kept the mons from failing the only remaining MDS.


So given a large filesystem, how do I prevent failover flapping between MDS instances that are in the rejoin state and reading the inode information?

Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


