Re: debugging pg states

John Spray <jspray@xxxxxxxxxx> · Mon, 2 Apr 2018 13:29:01 +0100

On Thu, Mar 29, 2018 at 10:16 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Thu, Mar 29, 2018 at 3:25 AM, John Spray <jspray@xxxxxxxxxx> wrote:
>> On Wed, Mar 28, 2018 at 11:44 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>> I had an amusing little problem today with a bug report about IO
>>> pausing on a cluster when OSDs are killed. Naturally, the first thing
>>> I wanted to do was see if it was the result of OSDs not getting marked
>>> down, or if the PGs were not peering quickly after that.
>>>
>>> Only it turns out that in Luminous, we no longer log the pg states to
>>> any single log I can find. ceph.log now contains only the health
>>> summary; I wasn't provided the mgr log but it appears to require debug
>>> 10 before printing out individual states.
>>
>> Let's change that to something like 4 instead of 10 so that it's at
>> least easier to get at them directly on the daemon?
>>
>>> This means the only way to
>>> get them is to have a high debug value while the logs are running (and
>>> I don't think this is something people are used to on the manager
>>> yet), and that any issues in the field will be difficult to resolve if
>>> they aren't immediately reproducible.
>>
>> The purist answer is that the PG states are included in the prometheus
>> output, which is a neater way of getting this kind of history of
>> quantitative things.  However, I'm not a purist, so...
>
> Yeah, this is a neat solution but I think for real-world debugging we
> need a better transition from the current state of affairs.
>
> Is there any plausible way for picking up prometheus states via the
> existing ceph debugging tools, or for integrating that with the Ceph
> logging events? If not I think we need it in-situ. Plus, isn't the
> in-memory prometheus logging quite short?

Reinstating the logging (with the LogMonitor improvement to avoid
filling up a global buffer) is probably the only real "existing tools"
path here.

The prometheus module is just giving you a moment-in-time view, so it
doesn't do anything for you without an external thing querying it.  Of
course, it's also very easy to write a few lines of Python that hit
the prometheus endpoint and log something every N seconds, but at that
point I'd just install prometheus.
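
For illustration, a minimal sketch of such a poller follows. It is not
part of any Ceph tool. The URL (the mgr prometheus module on
localhost:9283) and the "ceph_pg_" metric-name prefix are assumptions,
so adjust both to whatever your cluster actually exposes. It logs only
when the set of non-zero PG state counts changes.

```python
import logging
import time
import urllib.request

# Assumptions (not from the thread): where the mgr prometheus module
# listens, and that PG state gauges have names starting with "ceph_pg_".
METRICS_URL = "http://localhost:9283/metrics"
PG_PREFIX = "ceph_pg_"


def parse_pg_states(text, prefix=PG_PREFIX):
    """Return {metric name (with labels): value} for samples matching prefix."""
    states = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, "# HELP"/"# TYPE" comments, and unrelated metrics.
        if not line or line.startswith("#") or not line.startswith(prefix):
            continue
        if "}" in line:
            # Labelled sample: split after the closing brace of the labels.
            name, _, rest = line.rpartition("}")
            name += "}"
        else:
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            states[name] = float(fields[0])
        except ValueError:
            continue
    return states


def poll(url=METRICS_URL, interval=5.0):
    """Fetch the endpoint every `interval` seconds and log changes."""
    previous = None
    while True:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                body = resp.read().decode("utf-8", "replace")
        except OSError as exc:
            logging.warning("could not fetch %s: %s", url, exc)
        else:
            # Keep only states that currently have a non-zero count.
            current = {k: v for k, v in parse_pg_states(body).items() if v}
            if current != previous:
                logging.info("pg states: %s", current)
                previous = current
        time.sleep(interval)


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO, format="%(asctime)s %(message)s"
    )
    poll()
```

This still only gives you history from the moment you start it, which
is the same limitation as any external scraper.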

John

> -Greg
>
>>
>>> So: I'm pretty sure we need to log PG state changes in more detail by
>>> default. Does anybody have suggestions or preferences for *how* that
>>> happens? My preference is for them to show up in ceph.log...
>>
>> ... we could reinstate the PGMap spam at debug level in its own
>> channel in the cluster log, if we made LogMonitor keep separate
>> summary buffers for each channel.  Currently it has one global buffer,
>> which means that any regular output (like the PGMap every 5 seconds)
>> will blow away the recent history of any other type of log message --
>> that was the motivation for eliminating the PGMap message rather than
>> just degrading it to debug.
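
To make the per-channel buffer idea quoted above concrete, here is a
rough sketch in Python. It is purely illustrative and is not how
LogMonitor is implemented; the class name, method names, and buffer
limit are all hypothetical. The point is only that a chatty channel
can evict its own history but nobody else's.

```python
from collections import deque


class ChannelSummary:
    """Hypothetical per-channel ring buffers for recent log messages."""

    def __init__(self, per_channel_limit=50):
        self._limit = per_channel_limit
        self._buffers = {}

    def add(self, channel, message):
        # Each channel gets its own bounded deque, so overflow in one
        # channel drops only that channel's oldest entries.
        buf = self._buffers.get(channel)
        if buf is None:
            buf = deque(maxlen=self._limit)
            self._buffers[channel] = buf
        buf.append(message)

    def recent(self, channel):
        """Return the retained messages for a channel, oldest first."""
        return list(self._buffers.get(channel, ()))


if __name__ == "__main__":
    summary = ChannelSummary(per_channel_limit=3)
    summary.add("cluster", "osd.3 marked down")
    for i in range(100):
        summary.add("pgmap", "pgmap v%d" % i)
    # The single "cluster" message survives the flood on "pgmap".
    print(summary.recent("cluster"))
    print(summary.recent("pgmap"))
```

With one global buffer of the same size, the "cluster" entry in that
example would have been pushed out by the periodic PGMap lines.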