Benchmarks of logging using dout vs LTTng

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

In last week's performance call I presented benchmark results for
logging using dout vs LTTng. The cluster is all flash storage. Here are
the results:

https://docs.google.com/spreadsheets/d/1NfNBV8sm1_tgdBYmMAfmoEufZkkUNpkJNZ0gsOK4RgE/edit?usp=sharing

The PR is here:

https://github.com/ceph/ceph/pull/22458

Note that I only converted log statements in BlueStore.cc. The rest of
the codebase is unchanged, and the default log levels are set.

The take-aways are the following:

- There is a noteworthy performance difference only for high log levels
(debug_bluestore = 20). At a log level of 1, the performance difference
isn't significant.

- The performance difference is mostly shown for high queue depths,
which probably increases contention over some resource on the OSDs.
Looks like there might be a lock on an internal buffer for the log
statements of dout, or it might be due to serialization of the messages
to the output file. LTTng uses per-CPU buffers and per-CPU files which
is why we don't see scalability issues.

- The difference of the log files' disk size is significant, about an
order of magnitude smaller when using LTTng (this is not shown in these
results)

The downside of converting tracepoints to LTTng is the usability
concerns both for users and developers. For users, LTTng has to be
running, we need to add tracing session management, etc. For developers,
adding tracepoints to the code is not as easy as dout. In my PR, there
are two ways of adding LTTng tracepoints in Ceph:

1)
https://github.com/ceph/ceph/pull/22458/files#diff-a9faffcf40600fd57aea5451cef5abe9R787

And

2)
https://github.com/ceph/ceph/pull/22458/files#diff-a9faffcf40600fd57aea5451cef5abe9R11344

The second way is easier, but it is less efficient as everything is
converted to strings. It would still benefit from LTTng's per-CPU
logging though. Unfortunately, I didn't benchmark it. The benchmark
results were done using the first way, which uses basic types whenever
possible instead of converting everything to strings. The downside is
that the type of each field has to be specified.

What do you think can be done following these results? One option
obviously is to keep using dout since the performance penalty is
acceptable for the default log levels. We could then rework some of
dout's code to reduce the tail latency and reduce contention if
possible. Another option would be to change the messages of high log
levels to LTTng and keep the critical messages (dout(0) and dout(1))
using dout.

Either way, I think being more mindful when adding log statements to the
code is important: only log short messages, reduce string copying as
much as possible, only log the necessary information to reduce the usage
of the << operator as it does a lot of conversion of basic types to
strings, be conservative when setting dout_prefix, etc.

Mohamad

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx



[Index of Archives]     [CEPH Users]     [Ceph Devel]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux