Re: Flurries of 'git reflog expire'

I'm one of the Bitbucket Server developers. My apologies; I just
noticed this thread or I would have jumped in sooner!

On Thu, Jul 6, 2017 at 6:31 AM, Andreas Krey <a.krey@xxxxxx> wrote:
> On Wed, 05 Jul 2017 04:20:27 +0000, Jeff King wrote:
>> On Tue, Jul 04, 2017 at 09:57:58AM +0200, Andreas Krey wrote:
> ...
>> And what does the process tree look like?
>
> Lots (~ 10) of
>
>           \_ /usr/bin/git receive-pack /opt/apps/atlassian/bitbucket-data/shared/data/repositories/68
>           |   \_ git gc --auto --quiet
>           |       \_ git reflog expire --all
>
> plus another dozen gc/expire pairs where the parent is already gone.
> All with the same arguments - auto GC.

Do you know what version of Bitbucket Server is in use? The fact that
"git gc --auto" is being triggered from "git receive-pack" implies two
things:
- You're on a 4.x version of Bitbucket Server
- The repository (68) has never been forked

Depending on your Bitbucket Server version (this being the reason I
asked), there are a couple of different fixes available:

- Fork the repository. You don't need to _use_ the fork, but having a
fork existing will trigger Bitbucket Server to disable auto GC and
fully manage that itself. That includes managing both _concurrency_
and _frequency_ of GC. This works on all versions of Bitbucket Server.

- Run "git config gc.auto 0" in
/opt/apps/atlassian/bitbucket-data/shared/data/repositories/68 to
disable auto GC yourself. This may be preferable to forking the
repository, which, in addition to disabling auto GC, also disables
object pruning. However, you must be running at least Bitbucket Server
4.6.0 for this approach to work. Otherwise auto GC will simply be
reenabled the first time Bitbucket Server goes to trigger GC, when it
detects that the repository has no forks.

Assuming you're on 4.6.0 or newer, either approach should fix the
issue. If you're on 4.5 or older, forking is the only viable approach
unless you upgrade Bitbucket Server first.
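
As a rough sketch of the second approach (not taken from Bitbucket
Server itself), the commands below set "gc.auto" to 0 in a repository
and read the value back. The scratch bare repository created with
mktemp is only a stand-in so the snippet can be run safely anywhere; on
a real server "repo" would point at the repository directory instead.

```shell
#!/bin/sh
set -eu

# Assumption: a throwaway bare repository stands in for the real one
# (e.g. .../shared/data/repositories/68 on the Bitbucket Server host).
repo="$(mktemp -d)/demo.git"
git init --quiet --bare "$repo"

# Disable auto GC for this repository only (written to its local config).
git -C "$repo" config gc.auto 0

# Read the setting back to confirm it took effect.
gc_auto="$(git -C "$repo" config --get gc.auto)"
echo "gc.auto=$gc_auto"
```

Remember that on versions older than 4.6.0 this setting will not
stick, for the reason described above.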

I also want to add that Bitbucket Server 5.x includes totally
rewritten GC handling. 5.0.x automatically disables auto GC in all
repositories and manages it explicitly, and 5.1.x fully removes use of
"git gc" in favor of running relevant plumbing commands directly. We
moved away from "git gc" specifically to avoid the "git reflog expire
--all", because there's no config setting that _fully disables_
forking that process. By default our bare clones only have reflogs for
pull request refs, and we've explicitly configured those to never
expire, so all "git reflog expire --all" can do is use up I/O and,
quite frequently, fail because refs are updated while it runs. Since we stopped
running "git gc", we've not yet seen any GC failures on our internal
Bitbucket Server clusters.
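
To illustrate the "never expire" reflog setting mentioned above, here is
a minimal sketch using Git's documented "gc.<pattern>.reflogExpire"
keys. The "refs/pull-requests/*" pattern and the scratch repository are
assumptions for illustration; this is not necessarily the exact
configuration Bitbucket Server writes.

```shell
#!/bin/sh
set -eu

# Assumption: a throwaway bare repository, not a real Bitbucket Server one.
repo="$(mktemp -d)/demo.git"
git init --quiet --bare "$repo"

# Assumption: pull request refs live under refs/pull-requests/.
pattern='refs/pull-requests/*'

# Never expire reflog entries for refs matching the pattern.
git -C "$repo" config "gc.$pattern.reflogExpire" never
git -C "$repo" config "gc.$pattern.reflogExpireUnreachable" never

# Read both values back to confirm.
expire="$(git -C "$repo" config --get "gc.$pattern.reflogExpire")"
expire_unreachable="$(git -C "$repo" config --get "gc.$pattern.reflogExpireUnreachable")"
echo "reflogExpire=$expire reflogExpireUnreachable=$expire_unreachable"
```

With settings like these, "git reflog expire --all" has nothing to
remove for the matching refs, which is why running it is wasted work.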

Bitbucket Server 5.1.x also includes a new "gc.log" (not to be
confused with the one Git itself writes) which retains a record of
every GC-related process we run in each repository, and how long that
process took to complete. That can be useful for getting clearer
insight into both how often GC work is being done, and how long it's
taking.

Upgrading to 5.x can be a bit of an undertaking, since the major
version brings API changes, so it's totally understandable that many
organizations haven't upgraded yet. I'm just noting that these
improvements are there for when such an upgrade becomes viable.

Hope this helps!
Bryan

>
> I'd wager that each push sees that a GC is in order,
> and doesn't notice that there is one already running.
>
> - Andreas
>
> --
> "Totally trivial. Famous last words."
> From: Linus Torvalds <torvalds@*.org>
> Date: Fri, 22 Jan 2010 07:29:21 -0800


