On Thu, 29 Mar 2018, John Spray wrote:
> On Wed, Mar 28, 2018 at 11:44 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > I had an amusing little problem today with a bug report about IO
> > pausing on a cluster when OSDs are killed. Naturally, the first thing
> > I wanted to do was see if it was the result of OSDs not getting marked
> > down, or if the PGs were not peering quickly after that.
> >
> > Only it turns out that in Luminous, we no longer log the pg states to
> > any single log I can find. ceph.log now contains only the health
> > summary; I wasn't provided the mgr log but it appears to require debug
> > 10 before printing out individual states.
>
> Let's change that to something like 4 instead of 10 so that it's at
> least easier to get at them directly on the daemon?
>
> > This means the only way to
> > get them is to have a high debug value while the logs are running (and
> > I don't think this is something people are used to on the manager
> > yet), and that any issues in the field will be difficult to resolve if
> > they aren't immediately reproducible.
>
> The purist answer is that the PG states are included in the prometheus
> output, which is a neater way of getting this kind of history of
> quantitative things. However, I'm not a purist, so...
>
> > So: I'm pretty sure we need to log PG state changes in more detail by
> > default. Does anybody have suggestions or preferences for *how* that
> > happens? My preference is for them to show up in ceph.log...
>
> ... we could reinstate the PGMap spam at debug level in its own
> channel in the cluster log, if we made LogMonitor keep separate
> summary buffers for each channel. Currently it has one global buffer,
> which means that any regular output (like the PGMap every 5 seconds)
> will blow away the recent history of any other type of log message --
> that was the motivation for eliminating the PGMap message rather than
> just degrading it to debug. This is on Joao's todo list..

Joao, do you have any estimate? Or are there other takers?

sage
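
For anyone following along, here is a rough Python sketch of the buffer
problem John describes. It is not LogMonitor code: the class names, the
"cluster" and "pgmap" channel names, and the buffer size of 5 are all made up
for illustration. It only shows why a single bounded buffer loses the
interesting entry once a chatty channel is added, and why one bounded buffer
per channel would not.

```python
from collections import defaultdict, deque


class GlobalSummary:
    """Hypothetical model of one shared, bounded summary buffer."""

    def __init__(self, max_entries):
        # Oldest entries fall off the left once the buffer is full.
        self.entries = deque(maxlen=max_entries)

    def add(self, channel, message):
        self.entries.append((channel, message))

    def recent(self, channel):
        return [m for c, m in self.entries if c == channel]


class PerChannelSummary:
    """Hypothetical model of one bounded summary buffer per channel."""

    def __init__(self, max_entries):
        self.buffers = defaultdict(lambda: deque(maxlen=max_entries))

    def add(self, channel, message):
        # A chatty channel can only evict its own history.
        self.buffers[channel].append(message)

    def recent(self, channel):
        # .get() avoids creating an empty buffer for unknown channels.
        return list(self.buffers.get(channel, ()))


def demo(summary):
    """Log one rare event, then a burst of periodic PGMap-style lines."""
    summary.add("cluster", "osd.3 marked down")
    for i in range(10):
        summary.add("pgmap", f"pgmap v{i}")
    return summary.recent("cluster")


if __name__ == "__main__":
    print("global buffer:     ", demo(GlobalSummary(5)))
    print("per-channel buffer:", demo(PerChannelSummary(5)))
```

With the shared buffer the "osd.3 marked down" entry is gone after the burst;
with per-channel buffers it is still there, and the periodic channel simply
keeps its own last few lines.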