Re: [RFC PATCH v2 2/2] trace2: don't overload target directories

On 8/2/2019 6:02 PM, Josh Steadmon wrote:
> trace2 can write files into a target directory. With heavy usage, this
> directory can fill up with files, causing difficulty for
> trace-processing systems.
>
> This patch adds a config option (trace2.maxFiles) to set a maximum
> number of files that trace2 will write to a target directory. The
> following behavior is enabled when the maxFiles is set to a positive
> integer:
>    When trace2 would write a file to a target directory, first check
>    whether or not the directory is overloaded. A directory is overloaded
>    if there is a sentinel file declaring an overload, or if the number of
>    files exceeds trace2.maxFiles. If the latter, create a sentinel file
>    to speed up later overload checks.
>
> The assumption is that a separate trace-processing system is dealing
> with the generated traces; once it processes and removes the sentinel
> file, it should be safe to generate new trace files again.
>
> The default value for trace2.maxFiles is zero, which disables the
> overload check.
>
> The config can also be overridden with a new environment variable:
> GIT_TRACE2_MAX_FILES.
>
> Potential future work:
> * Write a message into the sentinel file (should match the requested
>    trace2 output format).
> * Add a performance test to make sure that contention between multiple
>    processes all writing to the same target directory does not become an
>    issue.
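
For reference, the overload check described in the quoted text can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch: the sentinel file name and both function names are hypothetical, and only trace2.maxFiles and GIT_TRACE2_MAX_FILES come from the description above.

```python
import os

# Hypothetical sentinel name; the quoted description does not give one.
SENTINEL_NAME = "trace2-overload-sentinel"


def max_files_setting(config_value=0, env=os.environ):
    """Return the effective limit; the environment variable overrides config."""
    raw = env.get("GIT_TRACE2_MAX_FILES")
    if raw is None:
        return config_value
    return int(raw)


def is_overloaded(target_dir, max_files):
    """Return True if no new trace file should be written to target_dir."""
    if max_files <= 0:
        # Zero (the default) disables the overload check entirely.
        return False

    sentinel = os.path.join(target_dir, SENTINEL_NAME)
    if os.path.exists(sentinel):
        # Fast path: an earlier process already declared an overload.
        return True

    # Slow path: count directory entries, stopping once the limit is passed.
    count = 0
    with os.scandir(target_dir) as entries:
        for _ in entries:
            count += 1
            if count > max_files:
                break

    if count > max_files:
        # Create the sentinel so later checks can skip the count.
        open(sentinel, "a").close()
        return True
    return False
```

The slow path is the part whose cost grows with the number of files, since it runs in every command until a sentinel exists.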


This looks much nicer than V1.  Having it be a
real feature rather than a test feature helps.

I don't see anything wrong with this.  I do worry about the
overhead a bit.  If you really have that many files in the
target directory, having every command count them at startup
might be an issue.

As an alternative, you might consider doing something like
this:

* Have an option to make the target directory path expand to
   something like "<path>/yyyymmdd/" and create the per-process
   files as "<path>/yyyymmdd/<sid>".

If there are 0, 1 or 2 directories, logging is enabled.
We assume that the post-processor is keeping up and all is well.
We need to allow 2 so that we continue to log around midnight.

If there are 3 or more directories, logging is disabled.
The post-processor is more than 24 hours behind for whatever
reason.  We assume here that the post-processor will process
and delete the oldest-named directory, so it is a valid measure
of the backlog.
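
A rough sketch of that check, written in Python only for illustration: the function names and the eight-digit directory pattern are assumptions made for this example, not anything that exists in trace2 today.

```python
import os
import re

# Assumed naming: one directory per day, named like "20190802".
DATE_DIR = re.compile(r"^\d{8}$")


def backlog_days(base_path):
    """Return the sorted names of the dated directories under base_path."""
    if not os.path.isdir(base_path):
        return []
    with os.scandir(base_path) as entries:
        return sorted(
            e.name for e in entries if e.is_dir() and DATE_DIR.match(e.name)
        )


def logging_enabled(base_path, cutoff=3):
    """Enable logging while fewer than `cutoff` dated directories exist."""
    return len(backlog_days(base_path)) < cutoff
```

The point is that only the top-level directory is scanned, so the cost depends on the number of days of backlog rather than on the number of trace files.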

I suggest "yyyymmdd" here for simplicity in this discussion
as daily log rotation is common.  If that's still overloading,
you could use a longer prefix of the <sid>, for example one
that includes the hour.
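
To make the prefix idea concrete, here is a small Python sketch. It assumes the <sid> begins with a timestamp such as "20190802T220200", so that the first 8 characters give the day and the first 11 give the day plus the hour; the function name and the sample <sid> in the comment are made up for this example.

```python
import os


def trace_file_path(base_path, sid, prefix_len=8):
    """Build "<path>/<sid prefix>/<sid>" for a per-process trace file.

    Assumes sid starts with a timestamp, for example
    "20190802T220200.000000Z-H00000000-P00001234" (illustrative value).
    prefix_len=8 buckets by day, prefix_len=11 buckets by hour.
    """
    if len(sid) < prefix_len:
        raise ValueError("sid is shorter than the requested prefix")
    bucket = sid[:prefix_len]
    return os.path.join(base_path, bucket, sid)
```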

I suggest 3 as the cutoff lower bound, because we need to allow
2 for midnight rotation.  But you may want to increase it to
allow for someone to be offline for a long weekend, for example.

Anyway, this is just a suggestion.  It would give you the
throttling, but without the need for every command to count
the contents of the target directory.

And it would still allow your post-processor to operate in
near real-time on the contents of the current day's target
directory or to hang back if that causes too much contention.

Feel free to ignore this :-)

Jeff