Re: Cluster log: hiding cluster map prints, decreasing audit log messages

On Fri, Jun 16, 2017 at 5:27 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Thu, 15 Jun 2017, John Spray wrote:
>> Some musings from me about the cluster log, interested in others' thoughts.

OK, you asked :)

When we clean up logging we walk a fine line between making the logs
"readable" for admins and removing information that can be crucial when
something goes wrong. Both goals have merit, but they pull in opposite
directions, which makes this tricky. The more often we can avoid having to
ask for logs at a higher debug level, the better the experience for the user,
since it is not always possible to reproduce these things on demand.

So... I was wondering if we could produce two logs: an expurgated log for
system administrators, containing only the information considered relevant at
that level, and a debugging log containing the level of detail support/devel
would want to see when trying to debug a problem. The debug log could even be
compressed on the fly if we are concerned about space. That might give us the
"best of both worlds"?

As I said, you asked :) (Seriously though, I'm not overly attached to this
idea, so please feel free to shoot it down in flames. I do think it is worthy
of consideration, however.)
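
To make the two-log idea a bit more concrete, here is a minimal sketch using
Python's standard logging module rather than anything from the Ceph code
base. The helper name make_dual_logger, the logger name "cluster-demo" and
the message format are all made up for illustration; the two sample messages
are the ones quoted further down this thread. One handler only passes INFO
and above (the admin log), the other keeps everything and writes through
gzip, so the debug log is compressed on the fly.

```python
import gzip
import io
import logging


def make_dual_logger(name, admin_stream, debug_stream):
    """Hypothetical helper: INFO+ goes to admin_stream, everything to debug_stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering
    logger.propagate = False
    logger.handlers.clear()

    fmt = logging.Formatter("%(levelname)s %(message)s")

    admin = logging.StreamHandler(admin_stream)
    admin.setLevel(logging.INFO)  # expurgated view for administrators
    admin.setFormatter(fmt)

    debug = logging.StreamHandler(debug_stream)
    debug.setLevel(logging.DEBUG)  # full detail for support/devel
    debug.setFormatter(fmt)

    logger.addHandler(admin)
    logger.addHandler(debug)
    return logger


# Demo with in-memory streams; real use would open files instead.
admin_buf = io.StringIO()
raw = io.BytesIO()
debug_gz = gzip.open(raw, "wt", encoding="utf-8")  # compress while writing

log = make_dual_logger("cluster-demo", admin_buf, debug_gz)
log.debug("osdmap e456 ...")
log.info("OSD 123 went down")

log.handlers.clear()  # detach handlers before closing the stream
debug_gz.close()
debug_text = gzip.decompress(raw.getvalue()).decode("utf-8")
```

After this runs, admin_buf holds only the "OSD 123 went down" line, while
debug_text holds both lines. The cost is that every message is formatted and
written twice, which is part of what would need weighing up.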

>>
>> Audit log
>> =======
>>
>> The audit logging is nice, but it has a couple of noticeable annoyances:
>>  - it crowds out real health messages, e.g. when using the new "log
>> last" command you may mostly see audit log messages, especially if
>> a monitoring tool is polling some commands.
>>  - the messages themselves are ugly JSON dumps
>>
>> We already have a separate channel for these, so it's easy for UIs to
>> split out the audit stuff (I just did this in the dashboard module),
>> but I think they're still consuming some number of the lines when we
>> fetch a set number of lines using "log last".
>>
>> Maybe things like log last (and the internal buffering in the mon)
>> should keep N lines for each channel, rather than channels competing
>> for the space?  It might already be like that on some level, haven't
>> dug into the mon internals yet.
>
> - last N for each channel makes sense to me.  The way LogMonitor
> implements this needs a bit of cleanup at the same time.
> - I think it makes sense to only show the 'default' channel by default and
> ignore all others (including audit log) unless asked for?  I guess this
> would add a channel option to 'last' that takes either the channel name
> (audit or default) or '*' or 'all' for all.
>
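
A rough sketch of what "last N for each channel" plus a channel argument could
look like, in Python for brevity. This is not how LogMonitor is implemented;
the class ChannelLog and its methods are hypothetical, and the sequence
counter is just an assumption about how to restore overall ordering when all
channels are requested together.

```python
from collections import deque
from itertools import count


class ChannelLog:
    """Hypothetical buffer keeping the last N entries per channel."""

    def __init__(self, per_channel=3):
        self._per_channel = per_channel
        self._buffers = {}    # channel name -> bounded deque
        self._seq = count()   # global order across channels

    def append(self, channel, message):
        buf = self._buffers.setdefault(
            channel, deque(maxlen=self._per_channel))
        buf.append((next(self._seq), channel, message))

    def last(self, n, channel="default"):
        """Return up to n newest entries; '*' or 'all' merges every channel."""
        if channel in ("*", "all"):
            entries = sorted(e for buf in self._buffers.values() for e in buf)
        else:
            entries = list(self._buffers.get(channel, ()))
        picked = entries[-n:] if n > 0 else []
        return [(ch, msg) for _, ch, msg in picked]


# Demo: a chatty audit channel no longer evicts the one health message.
clog = ChannelLog(per_channel=2)
clog.append("default", "OSD 123 went down")
for i in range(5):
    clog.append("audit", f"audit entry {i}")

default_view = clog.last(10)
audit_view = clog.last(10, "audit")
all_view = clog.last(10, "all")
```

Here default_view contains only the health message, audit_view the two newest
audit entries, and all_view all three in arrival order. Because each channel
has its own bounded buffer, polling from a monitoring tool can fill the audit
buffer without pushing anything out of the default one.
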
>> Cluster maps
>> ===========
>>
>> It is very nice for debugging that we can see updates to osdmap/fsmap
>> ticking by as the mon updates the state of the system.  However, it
>> kind of disrupts our ability to output clearly readable log messages
>> for ordinary users when things change.
>>
>> Maybe the cluster maps should be on a separate channel, like the audit logs are?
>>
>> Of course, when we're hiding the low level cluster map prints away, we
>> need to at the same time make sure we're adding in the right high
>> level "OSD 123 went down" messages to replace where the "osdmap e456
>> ..." lines currently give you the hint that something happened.
>
> We could also just put these at the DBG level so that they are hidden by
> default...
>
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Cheers,
Brad


