Re: Spurious MON re-elections

Sylvain Munaut <s.munaut@xxxxxxxxxxxxxxxxxxxx> · Fri, 3 Apr 2015 13:27:43 +0200

Hi,

> And indeed there's nothing in the log for mon.a between 17:49:32.77602
> and 17:50:10.929258, which seems not great. I'd look and see if
> something is happening with your disks, maybe?

Mmm, indeed.
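
For anyone wanting to check their own monitor logs for this kind of
silence, here is a minimal Python sketch. It assumes each log entry
starts with a "YYYY-MM-DD HH:MM:SS.ffffff" timestamp, as in the excerpts
in this thread; the function names and the 10 second threshold are
made up for illustration and are not part of Ceph.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    parts = line.split(None, 2)
    if len(parts) < 2:
        return None
    try:
        return datetime.strptime(parts[0] + " " + parts[1], TS_FORMAT)
    except ValueError:
        # Continuation lines or other text without a leading timestamp.
        return None


def find_gaps(lines, threshold_seconds=10.0):
    """Yield (previous, current, gap_seconds) for each silence longer
    than threshold_seconds between consecutive timestamped lines."""
    previous = None
    for line in lines:
        current = parse_timestamp(line)
        if current is None:
            continue
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > threshold_seconds:
                yield previous, current, gap
        previous = current
```

Passing an open log file to find_gaps() and printing each tuple should
list every stretch where the monitor logged nothing for longer than the
threshold.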

I had checked all the disks with SMART and the RAID controller wasn't
reporting any as failed, but digging deeper I managed to find a log
showing that one of the two disks in the RAID-1 that stores the monitor
data has had quite a few "aborted command" errors.

I just swapped that disk; I'll see if this completely fixes the issue.

> Based on your graphs, actually, the CPU load you're seeing is probably
> the cause, not the effect. An election can increase load some if a
> bunch of client messages get piled up and need to be processed, but
> otherwise it's just a couple messages and a hiccup in processing...

CPU is definitely not the cause.

If you look at the log of mon.b when it becomes leader, you get things like:

2015-03-28 17:49:58.221540 7fc1e9fed700  5 mon.b@1(leader).osd e19301
send_incremental [19286..19301] to osd.8 10.208.2.213:6814/15970
2015-03-28 17:49:58.221551 7fc1e9fed700  5 mon.b@1(leader).osd e19301
send_latest to osd.8 10.208.2.213:6814/15970 start 19286
2015-03-28 17:49:58.221554 7fc1e9fed700  5 mon.b@1(leader).osd e19301
send_incremental [19286..19301] to osd.8 10.208.2.213:6814/15970

And you get _a_lot_ of these. There are something like 23000 of these
emitted in less than 250 microseconds.
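
To put numbers on a burst like this, a rough Python sketch along these
lines could count the send_incremental entries, the time they span and
which OSDs they target. It assumes the log file has one entry per line
(the excerpts above are wrapped by the mail client) in the format shown
above; the function name and the regular expression are illustrative
only.

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp at the start of the line, then "send_incremental [a..b] to <target>".
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6}) .*"
    r"send_incremental \[(\d+)\.\.(\d+)\] to (\S+)"
)


def summarize_send_incremental(lines):
    """Return (count, span_seconds, per_target) for send_incremental entries."""
    count = 0
    first = last = None
    per_target = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # some other kind of log entry
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if first is None:
            first = stamp
        last = stamp
        count += 1
        per_target[match.group(4)] += 1
    span = (last - first).total_seconds() if count else 0.0
    return count, span, per_target
```

Running this over the mon.b log around the election should show how
many of these messages were emitted, over what interval, and whether
they are spread across OSDs or repeated for the same ones.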

Cheers,

   Sylvain
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com