Nautilus 14.2.19 mon 100% CPU

I upgraded our Luminous cluster to Nautilus a couple of weeks ago and converted the last batch of FileStore OSDs to BlueStore about 36 hours ago. Yesterday our monitor cluster went nuts and started constantly calling elections because the monitor nodes were at 100% CPU and wouldn't respond to heartbeats. I reduced the monitor cluster to a single monitor to stop the constant elections, and that let the system limp along until the backfills finished. There are long stretches where ceph commands hang while the CPU is at 100%. When the CPU drops, I see a lot of work getting done in the monitor logs, which stops as soon as the CPU is at 100% again.

I ran `perf top` on the node to see what's taking all the time, and it appears to be in the RocksDB code path. I've set `mon_compact_on_start = true` in ceph.conf, but that does not appear to help. The `/var/lib/ceph/mon/` directory is 311 MB, down from 3.0 GB while the backfills were going on. I've tried adding a second monitor, but it goes back to the constant elections. I tried restarting all the services without luck. I also pulled the monitor from the network and restarted the mon service in isolation (this helped a couple of weeks ago when `ceph -s` would cause 100% CPU and lock up the service much worse than this), and didn't see the high CPU load. So I'm guessing it's triggered by some external source.
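
For anyone who wants to track the monitor store size over time instead of checking it by hand, here is a minimal sketch. It only assumes the store lives under `/var/lib/ceph/mon/` as mentioned above. The helper names `dir_size_bytes` and `format_size` are hypothetical and are not part of any Ceph tooling, and the sizes are reported in binary units, so they may differ slightly from what `du -sh` prints.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files found under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                # lstat does not follow symlinks, so a link is counted
                # as the link itself rather than its target.
                total += os.lstat(full).st_size
            except OSError:
                # Files can vanish mid-walk (for example while the store
                # is being compacted); skip them rather than abort.
                pass
    return total


def format_size(nbytes):
    """Format a byte count using binary units, capped at GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if nbytes < 1024 or unit == "GiB":
            return f"{nbytes:.1f} {unit}"
        nbytes /= 1024


if __name__ == "__main__":
    # Path taken from the message above; adjust if the mon data lives elsewhere.
    print(format_size(dir_size_bytes("/var/lib/ceph/mon")))
```

Running this periodically (from cron, for example) and logging the output would show whether the store grows again when the CPU spikes.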

I'm happy to provide more info, just let me know what would be helpful.

Thank you,
Robert LeBlanc

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
