On 06/04/2019 07:01 PM, Jianyu Li wrote:
> Hello,
>
> I have a ceph cluster that has been running for over 2 years, and the
> monitors began crashing yesterday. I have had some OSDs flapping up and
> down occasionally, and sometimes I need to rebuild an OSD. I found 3 OSDs
> down yesterday; they may or may not be the cause of this issue.
>
> Ceph Version: 12.2.12 (upgrading from 12.2.8 did not fix the issue)
> I have 5 mon nodes. When I start the mon service on the first 2 nodes,
> they are fine. Once I start the service on the third node, all 3 nodes
> begin flapping up/down due to "Aborted" in
> OSDMonitor::build_incremental. I also tried to recover the monitors from
> 1 node (removing the other 4 nodes) by injecting a monmap, but that node
> keeps crashing as well.

Please increase the debug levels to 'debug_mon = 10' and 'debug_paxos = 10',
and send us the log once you have your next crash.

This could be a few things, but I'm guessing your other monitors have a
corrupted store somehow. Were there any hardware failures shortly before
the crashes started happening?

  -Joao
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
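
For reference, a minimal sketch of one way to set the debug levels requested above. It assumes the monitors read the default /etc/ceph/ceph.conf (the path is an assumption, not something stated in the thread). Because the monitors abort shortly after starting, putting the settings in the config file before restarting the mon service is likely more practical than changing them at runtime.

```ini
# Assumed location: /etc/ceph/ceph.conf on each monitor host.
# Raises monitor and paxos logging to the levels requested in the reply.
[mon]
debug mon = 10
debug paxos = 10
```

After the next crash, the monitor log (by default under /var/log/ceph/, assuming the log location was not changed) should contain the additional detail.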