Since the full cluster restart and disabling logging to syslog, it's not
a problem any more (for now).
Unfortunately, just disabling clog_to_monitors didn't have the desired
effect when I tried it yesterday, but I still believe it is somehow
related. I could not find any specific reason for yesterday's incident
in the logs besides a few more RocksDB status and compact messages than
usual, and those are more a symptom than a cause.
On 26/02/2021 13:05, Mykola Golub wrote:
On Thu, Feb 25, 2021 at 08:58:01PM +0100, Janek Bevendorff wrote:
On the first MON, the command doesn’t even return, but I was able to
get a dump from the one I restarted most recently. The oldest ops
look like this:
{
    "description": "log(1000 entries from seq 17876238 at 2021-02-25T15:13:20.306487+0100)",
    "initiated_at": "2021-02-25T20:40:34.698932+0100",
    "age": 183.762551121,
    "duration": 183.762599201,
The mon stores cluster log messages in the mon db. You mentioned
problems with osds flooding it with log messages, so this looks related.
If you still observe the db growth, you may try temporarily disabling
clog_to_monitors, i.e. set for all osds:
clog_to_monitors = false
Then see whether the db stops growing and whether it helps with the slow
ops (it might make sense to restart mons if some look like they got
stuck). You can apply the config option on the fly (without restarting
the osds, e.g. with injectargs), but when re-enabling it you will
have to restart the osds to avoid crashes due to this bug [1].
[1] https://tracker.ceph.com/issues/48946
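For reference, a sketch of how the above could be applied with the
standard ceph CLI (double-check command and option names against your
release):

    # apply on the fly to all running osds, no restart needed
    ceph tell osd.* injectargs '--clog_to_monitors=false'

    # also persist it in the config database so restarted osds pick it up
    ceph config set osd clog_to_monitors false

    # when turning it back on later, restart the osds afterwards (see [1])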
--
Bauhaus-Universität Weimar
Bauhausstr. 9a, R308
99423 Weimar, Germany
Phone: +49 3643 58 3577
www.webis.de