Hi,

We are observing that when the primary MDS goes away (say, as an OOM killer victim), the client keeps trying forever to write to it (the try_write method in the messenger), which eventually results in a filesystem hang.

So the questions are:

- Why does the kernel client not attempt another MDS?
- Is replication (MDS) guaranteed to take place before the primary MDS goes down? In other words, is replication done preemptively, or in response to a trigger (scheduled or event-based)?

Thanks again,
Jojy