Warnings in gc.log can prevent gc --auto from running

I think I've found some undesirable behavior in `git gc --auto`. The
tl;dr is that a warning message written to gc.log can result in
`git gc --auto` effectively disabling itself for the duration of
gc.logExpiry. The problem is easier to trigger in 2.22 because bitmap
indices are now enabled by default for bare repositories, and the
behavior can easily result in performance degradation, especially on
servers.

`git gc --auto` will stop itself from running if a gc.log file newer
than gc.logExpiry (1 day by default) exists. The intention of this
behavior seems reasonable enough. However, it is easy for a gc.log file
containing only harmless warnings to exist, and for that file to
effectively disable `git gc --auto`.
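
To make the check concrete, here is a rough Python sketch of the
behavior as described above. It is not Git's actual implementation: the
function name `gc_auto_is_blocked` and its parameters are hypothetical,
and it ignores details such as how Git parses the gc.logExpiry value.

```python
import os
import time

# gc.logExpiry defaults to 1 day, per the description above.
DEFAULT_LOG_EXPIRY_SECONDS = 24 * 60 * 60


def gc_auto_is_blocked(git_dir, expiry_seconds=DEFAULT_LOG_EXPIRY_SECONDS, now=None):
    """Approximate the gc.log check (hypothetical helper, not Git's code).

    Returns True if a gc.log newer than the expiry window exists in git_dir.
    """
    log_path = os.path.join(git_dir, "gc.log")
    try:
        mtime = os.stat(log_path).st_mtime
    except FileNotFoundError:
        return False  # no gc.log, so nothing blocks `git gc --auto`
    if now is None:
        now = time.time()
    # A gc.log older than the expiry window is ignored.
    return (now - mtime) < expiry_seconds
```

Note that nothing in this check looks at what the file says, which is
why a single warning line is enough to trip it.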

For example, if bitmap indices are being produced (this is the default
behavior for bare repositories in Git 2.22) and the user has taken any
action that would result in a `git gc` producing multiple packfiles
(setting gc.bigPackThreshold, setting pack.packSizeLimit, annotating a
packfile with a .keep file, etc.) then a message like "warning: disabling
bitmap writing, as some objects are not being packed" or "warning:
disabling bitmap writing, packs are split due to pack.packSizeLimit" may
be written to gc.log. This warning message will result in the presence
of a gc.log file, which will cause `git gc --auto` to stop doing
meaningful work until gc.logExpiry has passed or the gc.log is cleaned
up out-of-band.

The practical impact of this behavior is that an environment that has
made only minor tweaks to packfile behavior may end up inadvertently
disabling `git gc --auto`, so that excessive numbers of packfiles and
loose object files accumulate because `git gc --auto` isn't running.
This can result in performance degradation, especially for repositories
receiving hundreds or thousands of pushes a day - ask me how I know :)

I was able to work around this in a server environment by removing
gc.log if the contents were "harmless" warning messages, unblocking `git
gc --auto`. However, the solution is a bit brittle. As an end-user of
Git, I would prefer a `git gc --auto` execution mode that was less
sensitive to the presence of non-fatal messages in gc.log. Lowering the
value of gc.logExpiry is also a somewhat reasonable solution. I /think/
you could even make the value "now" to effectively disable the gc.log
check, but I haven't tested this. I don't feel great about that
workaround though, as if there is an actual gc/repack error, I'd like to
know about it instead of sweeping it under the rug by continuously
deleting the gc.log file. I'm also not keen on triggering `git gc --auto
--force` because --force will ignore lock files and I like respecting
lock files.
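
For reference, the gc.log cleanup workaround looks roughly like the
Python sketch below. The helper name `remove_harmless_gc_log` is
hypothetical, and the list of "harmless" prefixes is an assumption based
only on the two bitmap warnings quoted earlier; anything else in the
file is treated as a possible real error and left in place.

```python
import os

# Assumption: only the bitmap warnings quoted in this message are harmless.
HARMLESS_PREFIXES = ("warning: disabling bitmap writing,",)


def remove_harmless_gc_log(git_dir):
    """Delete git_dir/gc.log if every non-blank line is a known harmless warning.

    Returns True if the file was removed, False if it was absent or kept.
    """
    log_path = os.path.join(git_dir, "gc.log")
    try:
        with open(log_path, encoding="utf-8", errors="replace") as f:
            lines = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return False  # nothing to clean up
    # An empty file also passes this test and is removed.
    if not all(line.startswith(HARMLESS_PREFIXES) for line in lines):
        return False  # unexpected content: keep it so a human can look
    os.remove(log_path)
    return True
```

This is part of why I call the approach brittle: it depends on the exact
wording of the warnings, and it does not coordinate with a gc that may
be running at the same time.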

I don't profess to know the best way to solve this problem. I just
know it is a footgun sitting in the default Git configuration. And the
footgun became a lot easier to fire with the introduction of warning
messages related to bitmap indices and again when bitmap indices were
enabled by default for bare repositories in Git 2.22.

Gregory Szorc
gregory.szorc@xxxxxxxxx



