On Thu, Oct 18, 2018 at 10:31 PM Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
>
> I could see something related to that bug might be happening, but
> we're not seeing the "clock skew" or "signal: Hangup" messages in our
> logs.
>
> One reason that this cluster might be running into this problem is
> that we appear to have a script that is gathering stats for collectd
> which is running 'ceph pg dump' every 16-17 seconds.  I guess you
> could say we're stress testing that code path fairly well... :)

That would be one way of putting it!  Consider moving to the
prometheus module, which includes output about which PGs are in which
states (and does so without serializing every PG's full status...)

John

> Bryan
>
> > On Thu, Oct 18, 2018 at 6:17 PM Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
> > >
> > > After we upgraded from Jewel (10.2.10) to Luminous (12.2.5) we
> > > started seeing a problem where the new ceph-mgr would sometimes
> > > hang indefinitely when doing commands like 'ceph pg dump' on our
> > > largest cluster (~1,300 OSDs).  The rest of our clusters (10+)
> > > aren't seeing the same issue, but they are all under 600 OSDs
> > > each.  Restarting ceph-mgr seems to fix the issue for 12 hours or
> > > so, but usually overnight it'll get back into the state where the
> > > hang reappears.  At first I thought it was a hardware issue, but
> > > switching the primary ceph-mgr to another node didn't fix the
> > > problem.
> > >
> > > I've increased the logging to 20/20 for debug_mgr, and while a
> > > working dump looks like this:
> > >
> > > 2018-10-18 09:26:16.256911 7f9dbf5e7700 4 mgr.server handle_command decoded 3
> > > 2018-10-18 09:26:16.256917 7f9dbf5e7700 4 mgr.server handle_command prefix=pg dump
> > > 2018-10-18 09:26:16.256937 7f9dbf5e7700 10 mgr.server _allowed_command client.admin capable
> > > 2018-10-18 09:26:16.256951 7f9dbf5e7700 0 log_channel(audit) log [DBG] : from='client.1414554763 10.2.4.2:0/2175076978' entity='client.admin' cmd=[{"prefix": "pg dump", "target": ["mgr", ""], "format": "json-pretty"}]: dispatch
> > > 2018-10-18 09:26:22.567583 7f9dbf5e7700 1 mgr.server reply handle_command (0) Success dumped all
> > >
> > > A failed dump call doesn't show up at all.  The "mgr.server
> > > handle_command prefix=pg dump" log entry doesn't seem to even make
> > > it to the logs.
> >
> > This could be a manifestation of
> > https://tracker.ceph.com/issues/23460, as the "pg dump" path is one
> > of the places where the pgmap and osdmap locks are taken together.
> >
> > Deadlockyness aside, this code path could use some improvement so
> > that both locks aren't being held unnecessarily, and so that we
> > aren't holding up all other accesses to pgmap while doing a dump.
> >
> > John
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
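As a rough sketch of what the stats script could do instead of running
'ceph pg dump': the prometheus module serves a plain-text /metrics page
(port 9283 by default) with per-state PG gauges that are cheap to
scrape.  The ceph_pg_* metric names and labels below are illustrative
assumptions; check the actual /metrics output of your release for the
exact names it exports.

```python
# Sketch of a collectd-friendly poller that reads per-state PG counts
# from the mgr prometheus module's /metrics page rather than invoking
# 'ceph pg dump'.  The endpoint URL and ceph_pg_* names are assumptions
# to verify against your cluster's actual output.
from urllib.request import urlopen

def parse_pg_state_metrics(text):
    """Parse Prometheus exposition text and return {metric: value} for
    the PG-state gauges (lines whose metric name starts with ceph_pg_),
    summing across any label sets."""
    states = {}
    for line in text.splitlines():
        if line.startswith("#") or not line.startswith("ceph_pg_"):
            continue  # skip comments and non-PG metrics
        name, _, value = line.rpartition(" ")
        # drop any label set, e.g. ceph_pg_active{pool_id="1"}
        name = name.split("{", 1)[0]
        states[name] = states.get(name, 0.0) + float(value)
    return states

def fetch_pg_states(url="http://localhost:9283/metrics"):
    """Fetch the mgr's metrics page and extract the PG-state gauges."""
    with urlopen(url) as resp:
        return parse_pg_state_metrics(resp.read().decode())

if __name__ == "__main__":
    # Offline demo on a hand-written sample of exposition-format text.
    sample = (
        "# TYPE ceph_pg_active gauge\n"
        "ceph_pg_active 4096.0\n"
        "ceph_pg_degraded 12.0\n"
        "ceph_osd_up 1300.0\n"
    )
    print(parse_pg_state_metrics(sample))
    # -> {'ceph_pg_active': 4096.0, 'ceph_pg_degraded': 12.0}
```

Polling this endpoint every 16-17 seconds touches only the already
aggregated counters, so it avoids the pg dump path (and its pgmap/osdmap
locking) entirely.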