Re: Logging in a multi-brick daemon

On 02/16/2017 05:27 AM, Rajesh Joseph wrote:
On Thu, Feb 16, 2017 at 9:46 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/16/2017 04:09 AM, Jeff Darcy wrote:

One of the issues that has come up with multiplexing is that all of the
bricks in a process end up sharing a single log file.  The reaction from
both of the people who have mentioned this is that we should find a way to
give each brick its own log even when they're in the same process, and make
sure gf_log etc. are able to direct messages to the correct one.  I can
think of ways to do this, but it doesn't seem optimal to me.  It will
certainly use up a lot of file descriptors.  I think it will use more
memory.  And then there's the issue of whether this would really be better
for debugging.  Often it's necessary to look at multiple brick logs while
trying to diagnose a problem, so it's actually kind of handy to have them
all in one file.  Which would you rather do?

(a) Weave together entries in multiple logs, either via a script or in
your head?

(b) Split or filter entries in a single log, according to which brick
they're from?

To me, (b) seems like a much more tractable problem.  I'd say that what we
need is not multiple logs, but *marking of entries* so that everything
pertaining to one brick can easily be found.  One way to do this would be to
modify volgen so that a brick ID (not name because that's a path and hence
too long) is appended/prepended to the name of every translator in the
brick.  Grep for that brick ID, and voila!  You now have all log messages
for that brick and no other.  A variant of this would be to leave the names
alone and modify gf_log so that it adds the brick ID automagically (based on
a thread-local variable similar to THIS).  Same effect, but without making
translator names longer, so I'd kind of prefer this approach.  Before I
start writing the code, does anybody else have any opinions, preferences, or
alternatives I haven't mentioned yet?
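
Roughly what I have in mind for the thread-local variant, as an untested
sketch (all names below are hypothetical stand-ins for gf_log and friends,
and the "[brick-id]" tag format is just an illustration):

/* Untested sketch: each worker thread records which brick it is
 * currently serving (analogous to THIS), and the logging call
 * prefixes that ID to every message, so one shared file can still
 * be filtered per brick afterwards. */
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

static pthread_key_t  brick_id_key;
static pthread_once_t brick_id_once = PTHREAD_ONCE_INIT;

static void
brick_id_key_init (void)
{
        pthread_key_create (&brick_id_key, NULL);
}

/* Called when a thread starts work on behalf of a brick; the caller
 * must keep the id string alive for the duration. */
void
set_brick_id (const char *id)
{
        pthread_once (&brick_id_once, brick_id_key_init);
        pthread_setspecific (brick_id_key, (void *) id);
}

/* Stand-in for gf_log(): prepend "[brick-id]" to every message. */
void
brick_log (const char *domain, const char *fmt, ...)
{
        const char *id;
        va_list     ap;

        pthread_once (&brick_id_once, brick_id_key_init);
        id = pthread_getspecific (brick_id_key);

        fprintf (stderr, "[%s] %s: ", id ? id : "-", domain);
        va_start (ap, fmt);
        vfprintf (stderr, fmt, ap);
        va_end (ap);
        fputc ('\n', stderr);
}

int
main (void)
{
        set_brick_id ("brick-2");
        brick_log ("posix", "demo message for brick %d", 2);
        return 0;
}

With something like that in place, everything for one brick is a single
grep away, no matter how the messages interleave.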

My vote is for having separate log files per brick.  Even with the separate
log files that we have today, I find it difficult to mentally ignore
irrelevant messages in a single log file as I sift through it looking for
errors related to the problem at hand.  Having entries from multiple bricks
in one file and then grepping through them would only make things harder.
I cannot think of a case where having entries from all bricks in one file
would be particularly beneficial for debugging, since what happens in one
brick is independent of the other bricks (at least until we move client
xlators to the server side and run them in the brick process).

As for file descriptor count/memory usage, I think we should be okay, as it
is no worse than the non-multiplexed approach we have today.

On a side note, I think the problem is not having too many log files but
having them spread across multiple nodes.  A log-aggregation solution where
all messages are logged to a single machine (but still in separate files)
would make it easier to monitor and debug issues.
-Ravi


I believe the logs are not just from one volume but from all.  In that case,
merging them into a single log file may not be great for debugging,
especially in container use cases where there can be multiple volumes.
Yes, with some tagging and scripting we can separate the logs and still
live with it.

In the container world, *I believe* centralized logging (using something like an ELK/EFK stack) would be the way to go, rather than collecting logs from each gluster (or application mount) container/node. In these situations we are going to get logs from different volumes anyway, or at best a filtered list from whichever stack is used for centralized logging.

So I would think, as is being described, that we need enough identifiers in each log message that we can filter appropriately; that should take care of the debugging concern.

Of course, building these scripts out from the beginning, and possibly even shipping them with our RPMs, may help a great deal, rather than having to roll one out when we get into troubleshooting or debugging a setup.
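
For instance, a trivial splitter along these lines could ship alongside the
daemon (untested, and assuming the "[brick-id]" tag format sketched earlier;
a real tool would cache the open files rather than reopening per line):

/* Hypothetical post-processing filter: split a multiplexed brick log
 * (read on stdin) into per-brick files, keyed on a leading "[id]" tag.
 * Opening and closing the output file per line is for brevity only. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (void)
{
        char  line[4096];
        char  id[128];
        char  path[160];
        FILE *out;

        while (fgets (line, sizeof (line), stdin)) {
                /* Lines without a recognizable tag go to a catch-all. */
                if (sscanf (line, "[%127[^]]]", id) != 1)
                        strcpy (id, "untagged");
                snprintf (path, sizeof (path), "%s.log", id);
                out = fopen (path, "a");
                if (!out) {
                        perror (path);
                        return EXIT_FAILURE;
                }
                fputs (line, out);
                fclose (out);
        }
        return EXIT_SUCCESS;
}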


What about the log levels?  Each volume can configure different log levels.
Will you carve out a separate process in case log levels are changed for a
volume?  How is this handled here?

-Rajesh

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel


