Hi Joao,

We have a problem when trying to add new monitors to an unhealthy cluster, and I would like to ask for your suggestion. After the new monitor was added, it started syncing the store and went into an infinite loop:

2015-11-12 21:02:23.499510 7f1e8030e700 10 mon.mon04c011@2(synchronizing) e5 handle_sync_chunk mon_sync(chunk cookie 4513071120 lc 14697737 bl 929616 bytes last_key osdmap,full_22530) v2
2015-11-12 21:02:23.712944 7f1e8030e700 10 mon.mon04c011@2(synchronizing) e5 handle_sync_chunk mon_sync(chunk cookie 4513071120 lc 14697737 bl 799897 bytes last_key osdmap,full_3259) v2

We talked early this morning on IRC, and at the time I thought the loop was caused by the osdmap epoch increasing. I then set the nobackfill/norecovery flags and the osdmap epoch froze, but the problem is still there. While the osdmap epoch is 22531, the sync always loops back at osdmap.full_22530 (as shown in the log above).

Looking at the code on both sides, it seems this check (https://github.com/ceph/ceph/blob/master/src/mon/Monitor.cc#L1389) is always true. I can confirm from the log that (sp.last_commited < paxos->get_version()) was false, so the likely explanation is that sp.synchronizer always has a next chunk?

Does this look familiar to you? Or is there any other troubleshooting I can try? Thanks very much.

Thanks,
Guang
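
P.S. For reference, this is roughly how I read that branch on the sync provider side. It is only a paraphrase of the logic as I understand it, not the exact code; the OP_CHUNK/OP_LAST_CHUNK structure below is my own reconstruction of what the check decides:

    // Paraphrase (my reading, not the exact Monitor.cc code): if the provider
    // is still behind paxos OR the store synchronizer reports another chunk,
    // we keep answering with OP_CHUNK, so the joining monitor never receives
    // OP_LAST_CHUNK and never finishes the sync.
    if (sp.last_committed < paxos->get_version() ||   // false in our case
        sp.synchronizer->has_next_chunk()) {          // apparently always true
      reply->op = MMonSync::OP_CHUNK;       // send another chunk
    } else {
      reply->op = MMonSync::OP_LAST_CHUNK;  // sync would complete here
    }

Since the first condition is false for us, has_next_chunk() must keep returning true, which is why I suspect the synchronizer never runs out of keys to send.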