Thank you for your input, Atin and Xie Changlong.
Regarding log ordering - my initial thought was to do it offline using a dedicated tool. This should be straightforward, as the logs have a timestamp composed of seconds and microseconds, so ordering them by this value is definitely possible.
This is actually one of the main reasons I wanted to bring this up for discussion - would the community be fine with running a dedicated tool to reorder the logs offline?
Reordering the logs offline will allow us to gain the most performance improvement, as keeping the logs ordered online will have some cost (probably through stronger synchronization).
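To make the offline idea concrete, here is a minimal sketch of such a reordering tool. It is only an illustration, not an existing Gluster utility: it assumes each log entry starts with a bracketed timestamp of the form "[YYYY-MM-DD HH:MM:SS.uuuuuu]", and the names parse_entries and reorder are hypothetical.

```python
import re
from datetime import datetime

# Assumed entry prefix: "[2019-11-21 21:04:05.123456] ..." (seconds plus microseconds).
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\]")


def parse_entries(lines):
    """Group lines into [timestamp, text] entries.

    Lines without a timestamp are treated as continuations of the previous
    entry. Lines that appear before the first timestamp are kept as an entry
    with the minimum timestamp so that nothing is dropped.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        match = TS_RE.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            entries.append([ts, line])
        elif entries:
            entries[-1][1] += "\n" + line
        else:
            entries.append([datetime.min, line])
    return entries


def reorder(lines):
    """Return entry texts sorted by timestamp.

    The sort is stable, so entries with identical timestamps keep their
    original relative order.
    """
    entries = parse_entries(lines)
    entries.sort(key=lambda entry: entry[0])
    return [text for _, text in entries]
```

Since the whole file is read into memory before sorting, this would only suit logs of moderate size; if each thread wrote its own already-ordered file, a streaming merge (for example with heapq.merge) might be a better fit.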
Moreover, we can take log viewing one step further and maybe create a GUI (Java-based?) to view and handle logs (e.g. one window showing the combined, ordered logs and other windows showing logs per thread, etc.).
Regarding the test method - my initial testing was done by removing all logging from the regression tests. Modifying the method "skip_logging" to return 'true' in all cases seems to remove most of the logs (though not all; to be on the safe side, actually removing all logging-related methods is probably even better).
As the regression tests are mostly single-node tests, some additional testing was needed. I've written a couple of very basic scripts that create a large number of files / big files, read from and write to them, move them around and perform some other basic operations.
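For illustration, a workload of that kind could look roughly like the sketch below. This is not the actual script used for the numbers in this thread; run_workload and its parameters are hypothetical, and root is assumed to be a directory on a mounted Gluster volume.

```python
import os
import shutil
import time


def run_workload(root, num_files=100, file_size=4096):
    """Create, read back and move files under `root`; return elapsed seconds."""
    src = os.path.join(root, "src")
    dst = os.path.join(root, "dst")
    os.makedirs(src, exist_ok=True)
    os.makedirs(dst, exist_ok=True)
    payload = b"x" * file_size

    start = time.perf_counter()
    for i in range(num_files):
        name = f"file_{i}"
        path = os.path.join(src, name)
        # Write the file, read it back, then move it to the other directory.
        with open(path, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            if len(f.read()) != file_size:
                raise RuntimeError(f"short read on {path}")
        shutil.move(path, os.path.join(dst, name))
    return time.perf_counter() - start
```

Running it once with logging enabled and once with logging disabled, on the same volume and with the same parameters, would give two timings to compare.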
I'd actually be glad to test this in some 'real world' use cases - if you have specific use cases that you run frequently, we can model them and benchmark against them, which will likely give an even more accurate picture.
On Fri, Nov 22, 2019 at 7:27 AM Xie Changlong <zgrep@xxxxxxx> wrote:
On 2019/11/21 21:04, Barak Sason Rofman wrote:
I see two design / implementation problems with that mechanism:
The mutex that guards the log file is likely under constant contention.
The fact that each worker thread performs the I/O itself, thus slowing down its "real" work.
Initial tests, done by removing logging from the regression testing, show an improvement of about 20% in run time. This indicates we’re taking a pretty heavy performance hit just because of the logging activity.
Hi Barak Sason Rofman. Amazing perf improvement! Could you show me the detailed test method?
Thanks
-Xie
In addition to these problems, the logging module is due for an upgrade:
There are dozens of APIs in the logger, many of which are deprecated - this makes it very hard for new developers to keep evolving the project.
One of the key points for Gluster-X, presented in October at Bangalore, is the switch to structured logging all across gluster.
--
Gluster Storage Development