Re: [Gluster-devel] GlusterFS and the logging framework

Nithya Balachandran <nbalacha@xxxxxxxxxx> · Fri, 2 May 2014 00:18:23 -0400 (EDT)

Thanks Joe. Assuming you like approach#2, please let us know of anything else that you would find helpful in the gluster logs.

Thanks,
Nithya

----- Original Message -----
From: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
To: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>, gluster-devel@xxxxxxxxxxx
Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: Wednesday, 30 April, 2014 7:04:14 PM
Subject: Re: [Gluster-devel] GlusterFS and the logging framework

>From my perspective as someone who supports GlusterFS on IRC and frequently has to read logs in order to determine what went wrong, I like this proposal.

On April 30, 2014 12:06:26 AM PDT, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
>Hi,
>
>I have attached some DHT files to demonstrate the 2 logging approaches.
>(*_1 is the original approach, *_2 is the proposed approach).I
>personally think the 2 approach leads to better code readability and
>propose that we follow approach 2. Please let me know of any concerns
>with this.
>
>
>To consolidate all the points raised in the earlier discussions:
>
>
>What are we trying to solve?
>Improving gluster logs to make end user debugging easier by providing a
>sufficient information and a consistent logging mechanism and message
>format .
>
>The new logging framework already logs the function name and line,
>msgid and strerror, which improves the log messages and debug-ability.
>However, there are some potential issues with the way it is getting
>used. Please note - there are no changes being proposed to the
>underlying logging framework.
>
>
>Current approach (approach 1):
>
>Define message_ids for each log message (except Trace and Debug) and
>associate both id and string with a msg_id macro
>Replace all calls to gf_log with gf_msg passing in the message_id for
>the message. This message_id will be printed as part of the log
>message.
>Document each log string with details of what caused it/how to fix it.
>
>
>
>Issues:
>1. Code readability - It becomes difficult to figure out what the
>following is actually printing and can cause issues with incorrect
>params being passed or params being passed in the wrong order:
>gf_msg ("dht", GF_LOG_ERROR, 0, dht_msg_23, param1, param2, param3);
>
>2.Code Redundancy -multiple messages for the same thing differing in
>small details can potentially use up a large chunk of allocated ids as
>well as making it difficult for end users - they will need to search
>for multiple string formats/msgids as they could all refer to more or
>less the same thing. For example:
>
>dht_msg_1   123, "Failed to get cached subvol for %s"
>dht_msg_2   124, "Failed to get cached subvol for %s on %s"
>
>
>
>3. Documentation redundancy -
>
>The proposed format for documenting these messages is as follows:
>
>Msg ID
>Message format string
>Cause
>Recommended action
>
>This could potentially lead to documentation like:
>
>Msg ID : 123
>Message format string : Failed to get cached subvol for <path>
>Cause : The subvolume might not be reachable etc etc
>Recommended action : Check network connection  etc etc
>
>Msg ID : 124
>Message format string : Failed to get cached subvol for <path> on
><path2>
>Cause : The subvolume might not be reachable etc etc
>Recommended action : Check network connection  etc etc
>
>The end user now has to search for multiple msgids and string formats
>to find all instances of this error.
>
>NOTE: It may be possible to consolidate all these strings into a single
>one, say, "Failed to get cached subvol for %s on %s" and mandate that
>it be used in all calls which are currently using variations of the
>string. However, this might not be possible in all scenarios - some
>params might not be available or might not be meaningful in a
>particular case or a developer might want to provide additional info in
>a particular scenario.
>
>
>
>Proposed approach (approach 2):
>Define meaningful macros for message_ids for a class of message (except
>Trace and Debug) without associating them to a message string. For
>example
>#define DHT_CACHED_SUBVOL_GET_FAILED 123
>#define DHT_MEM_ALLOC_FAILED         124
>
>
>Replace all calls to gf_log with gf_msg but pass in the msg id and
>string separately. The string is defined by the developer based on an
>agreed upon format.
>
>Define a log message format policy that all developers need to follow.
>This will need to be enforced by reviews. For example, we could mandate
>that all log messages must start with the name of the file on which the
>operation is performed and end with the strerror if it exists.This can
>also include rules as to sentence structure and wording - whether to
>use "failed", "unable to", "could not" etc.
>
>Consolidate existing messages and reword them if necessary to make them
>more meaningful. If a single message will work in multiple instances,
>use that one everywhere.
>
>Add your documentation writer as a reviewer for all patches. S/he will
>be responsible for ensuring that all newly introduced log messages are
>meaningful, consistent and follow the agreed upon format.
>
>Devs will define new message classes ids as and when required.
>
>Ideally, common message classes like dict-set-failed or
>memory-alloc-failed should be defined in a common file and included by
>others - no point having each component define a memory_alloc_failed
>id.
>
>With the proposed approach:
>
>#define DHT_CACHED_SUBVOL_GET_FAILED 123
>#define DHT_HASHED_SUBVOL_GET_FAILED 124
>
>Calls would then look like:
>
>gf_msg ("dht", GF_LOG_ERROR, DHT_CACHED_SUBVOL_GET_FAILED, "Failed to
>get cached subvolume for path %s", param1);
>...
>gf_msg ("dht", GF_LOG_ERROR, DHT_CACHED_SUBVOL_GET_FAILED, "Failed to
>get cached subvolume for path %s on %s", param1, param2);
>
>
>Documentation would be as follows:
>
>Msg ID : 123
>Description : Failed to get the cached subvolume for the specified path
>Cause : The subvolume might not be reachable etc etc
>Recommended action : Check network connection  etc etc
>
>Admins could just search for [MSGID 123] and find all instances of
>where an operation failed to get a cached sub volume.
>
>Issues raised with proposed approach:
>
>1. Internationalization: Having the strings in a single file is
>required to make L10N easy.
>While i18n is very important for user tools, utils etc, log messages
>are usually targeted at developers and sys admins who usually know
>English. Plus it seems unlikely that log messages will be localized in
>the near future. However, the document describing the msgid can be
>localized so the msg id mapping information can still be used.
>
>2. Having the strings in a header file makes it easier to change the
>format later.
>This is a valid point. However, IMHO, the code readability is more
>important especially in the case where we pass arguments to the format
>string.
>
>3. Having a string defined in a single header file can make it easier
>for a dev to reuse it if necessary
>I would suggest searching on the message id instead and copy the string
>from elsewhere if it suits his purpose as those will already have been
>reviewed by the doc writer.
>
>
>Regards,
>Nithya
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Gluster-devel mailing list
>Gluster-devel@xxxxxxxxxxx
>http://supercolony.gluster.org/mailman/listinfo/gluster-devel

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users