Re: [Gluster-Maintainers] [Gluster-devel] Modifying gluster's logging mechanism

Barak Sason Rofman <bsasonro@xxxxxxxxxx> · Sun, 24 Nov 2019 10:56:54 +0200

Thank you all for participating in this discussion.

Regarding Yaniv's comments:
> If you need an external tool (please not Java - let's not add another language to the project), you might as well move to binary logging.
> I believe we need to be able to do this sort using Linux command line tools only.
My intention was not at all to perform ordering using Java.
Ordering of the logs can be done easily using Python or even C (these are the tools that I know). 
I'd be happy to learn how this can be done with just the Linux CLI; please share your insight.
Regarding Java - I only meant to use it for a GUI tool (Java Swing).
A GUI tool that presents the ordered logs in a couple of ways (e.g. a tab that shows the sorted logs for all threads, separate tabs per thread, etc.) may add some value.
Regarding binary logging, the system I proposed is actually already "semi-binary", as it logs the timestamp as raw hex. A switch to full binary is fairly simple, and I do see advantages in that proposal.
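To make the ordering step concrete, here is a minimal Python sketch of merging per-thread logs by timestamp. It assumes (this is not taken from the project) that every line has the form "<hex timestamp> <message>" and that each thread's log is already ordered internally; parse_line, merge_logs and the sample lines are hypothetical names and data used only for illustration.

```python
import heapq
from typing import Iterable, Iterator, Tuple


def parse_line(line: str) -> Tuple[int, str]:
    """Split a line of the ASSUMED form '<hex timestamp> <message>'."""
    stamp, _, message = line.rstrip("\n").partition(" ")
    return int(stamp, 16), message


def merge_logs(streams: Iterable[Iterable[str]]) -> Iterator[Tuple[int, str]]:
    """Merge several streams, each already sorted by timestamp, into one."""
    parsed = [map(parse_line, stream) for stream in streams]
    # heapq.merge keeps only one pending entry per stream in memory.
    return heapq.merge(*parsed, key=lambda entry: entry[0])


# Hypothetical per-thread logs (in practice these would be open files).
thread_a = ["0000000a open /vol/file", "0000001f close /vol/file"]
thread_b = ["00000010 lookup /vol/dir", "00000011 stat /vol/dir"]

merged = list(merge_logs([thread_a, thread_b]))
for stamp, message in merged:
    print(f"{stamp:08x} {message}")
```

Because each input is consumed lazily, a tool along these lines would not need to load whole log files into memory.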

> This is not a fair comparison:
> 1. The regression tests are running with debug log
> 2. Not logging at all != replacing the logging framework - the new one will have its own overhead as well.
> 3. I believe there's a lot of code that is not real life scenarios there - such as dumping a lot of data there (which causes a lot of calls to inode_find() and friends - for example).
1 - Actually, I'm not sure about this; I need to verify.
2 - I haven't claimed that a new system would suddenly give us a 20% performance increase. I have pointed out a potentially problematic influence of the current mechanism that users (and developers) may be unaware of (as Strahil's comment suggests).
3 - That's the power of a community. I encourage users and developers to perform further tests on the matter, with real-life scenarios, so we'd have a better understanding of the impact.

Regarding Ravi's comments:
> maintaining causality of messages and working with command line text editors and tools on log files is important IMO.
Would running a tool in the form of "# logOrderer /someDirWithLogs" and having the logs sorted the way they are currently sorted be so bad?
Furthermore, the system I proposed can easily produce ordered logs if no threads are registered for a private buffer. As I mentioned, and as the project documentation mentions, if a thread doesn't have a private buffer, it automatically falls back to "level 2" writing, which is writing to a shared buffer - basic async logging. Levels 2 and 3 maintain log ordering.
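The fallback described above can be sketched roughly as follows. This is an illustration in Python, not the proposed system's code: BufferedLogger, register_thread and shared_messages are hypothetical names, and only the private-buffer and shared-buffer paths are modelled.

```python
import threading
from collections import deque
from typing import Deque, Dict, List


class BufferedLogger:
    """Hypothetical logger: private buffer if registered, shared buffer otherwise."""

    def __init__(self) -> None:
        self._private: Dict[int, List[str]] = {}
        self._shared: Deque[str] = deque()
        self._lock = threading.Lock()

    def register_thread(self) -> None:
        """Give the calling thread its own private buffer."""
        with self._lock:
            self._private[threading.get_ident()] = []

    def log(self, message: str) -> str:
        """Store the message and report which path was taken."""
        buf = self._private.get(threading.get_ident())
        if buf is not None:
            # Only the owning thread appends here, so no lock is taken.
            buf.append(message)
            return "private"
        # Fallback: all unregistered threads share one buffer, so messages
        # are stored in the order in which they acquire the lock.
        with self._lock:
            self._shared.append(message)
        return "shared"

    def shared_messages(self) -> List[str]:
        with self._lock:
            return list(self._shared)


logger = BufferedLogger()
first = logger.log("a")    # no private buffer yet: shared path
second = logger.log("b")
logger.register_thread()
third = logger.log("c")    # now takes the private path
```

In this sketch the shared buffer trades some contention for a single, naturally ordered stream, while the private path avoids the lock at the cost of needing a later merge.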

I think at this point we can focus the discussion on 2 points:
1 - Do we want to change the current system?
The current mechanism is "synced" logging, which definitely hurts performance. Are we OK with taking that hit, or do we want to improve?
"Async logging" is not a new concept, and it certainly has its advantages over "synced" logging.
2 - Given that the answer for question 1 is "yes", what do we require from the new logging system?
I proposed a system I've been developing as a side project for the past couple of weeks. If you comment specifically on it, I'd appreciate it if you looked at the proposed mechanism first (and please remember that the project is still a work in progress).
Also, there are a lot of logging alternatives out there (e.g. Log4c) which are definitely worth consideration.
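For readers less familiar with the "synced" vs. "async" distinction in point 1, here is a small illustration using Python's standard logging module. It has nothing to do with Gluster's C logging code or with the system I proposed; it only shows the general idea that the calling thread enqueues a record and a separate thread performs the actual write. The logger name "async_demo" and the in-memory sink are arbitrary choices for the example.

```python
import io
import logging
import logging.handlers
import queue

sink = io.StringIO()                        # stands in for a log file
write_handler = logging.StreamHandler(sink)  # performs the actual write

log_queue: queue.Queue = queue.Queue()
# The listener runs its own thread and passes queued records to write_handler.
listener = logging.handlers.QueueListener(log_queue, write_handler)

logger = logging.getLogger("async_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
# The calling thread only puts records on the queue; it never touches the sink.
logger.addHandler(logging.handlers.QueueHandler(log_queue))

listener.start()
for i in range(3):
    logger.info("message %d", i)
listener.stop()  # processes what is still queued, then joins the thread

lines = sink.getvalue().splitlines()
print(lines)
```

With a single queue and a single writer thread, records come out in the order they were enqueued, which is the property discussed above for a shared buffer.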

Lastly, I want to address Strahil's comment:
> As an end user, I think that performance improvements must be of utmost priority and this 'async logging' approach makes sense.
> Actually, you make me think if I really need such detailed logs (I'm running an oVirt lab), as I can benefit from logless gluster's performance.
Obviously there are many different users with many different use cases out there, and I believe we should be flexible enough to provide them with a suitable solution for their needs. I'd hate to see users turn off logging just because it hurts their performance, as that would hurt our ability as developers to provide support when needed.

Again, thank you for participating; I look forward to more comments and input.

On Fri, Nov 22, 2019 at 12:19 PM Ravishankar N <ravishankar@xxxxxxxxxx> wrote:

On 22/11/19 3:13 pm, Barak Sason Rofman wrote:
> This is actually one of the main reasons I wanted to bring this up for
> discussion - will it be fine with the community to run a dedicated
> tool to reorder the logs offline?

I think it is a bad idea to log without ordering and later rely on an
external tool to sort it. This is definitely not something I would want
to do while doing test and development or debugging field issues.
Structured logging is definitely useful for gathering statistics and
post-processing to make reports and charts and what not, but from a
debugging point of view, maintaining causality of messages and working
with command line text editors and tools on log files is important IMO.

I had similar concerns when the brick multiplexing feature was developed,
where a single log file was used for logging all multiplexed bricks'
logs. So much extra work to weed out messages of 99 processes to read
the log of the 1 process you are interested in.

Regards,
Ravi

-- 
Barak Sason Rofman
Gluster Storage Development
Red Hat Israel
34 Jerusalem rd. Ra'anana, 43501
bsasonro@redhat.com    T: +972-9-7692304
M: +972-52-4326355
