Re: Race condition between repack and loose-objects maintenance task

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 29, 2022 at 10:21 AM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> On Wed, Jun 29, 2022 at 10:16:38AM -0700, Gregory Szorc wrote:
> > > In either case, my recommendation would be to keep those unreachable
> > > objects which haven't yet aged out of the repository around for longer,
> > > which will decrease the likelihood of seeing the race.
> >
> > We had to lower gc.pruneExpire from its default of 1 hour because
> > otherwise this would result in the creation of thousands of loose
> > objects. This is normally acceptable. However, on NFS, the churn from
> > lots of file creation and deletion resulted in acceptable performance
> > degradation. We had to lower gc.pruneExpire to minimize the damage.
>
> Makes sense. Having a shorter gc.pruneExpire makes the race much more
> likely to occur in practice, so I'm glad that we have a plausible
> confirmation that that's what's going on in your repository.
>
> > > See Documentation/technical/cruft-packs.txt for more information about
> > > cruft packs.
> >
> > Yes, I saw this new feature in the Git release this week and am hoping
> > to roll it out to better mitigate this general problem! With cruft
> > packs, I anticipate being able to restore a longer gc.pruneExpire
> > value as they should mitigate the NFS performance badness we're
> > working around. Thank you for confirming it should help with the
> > problem.
>
> That should definitely help. Let me know if you have any questions, and
> thanks for trying it out!

Revising my initial post, not running `loose-objects` is insufficient:
we also need to prevent `incremental-repack` from running while a `git
gc` / `git repack` is in progress. If an `incremental-repack` runs
concurrently with `git repack`, `repack` can error after bitmap
generation but before the temp packfile is renamed with "could not
find 'pack-*.pack'." I suspect this has to do with
`incremental-repack` deleting packs that were initially in place when
`git repack` started running. But I haven't looked into where exactly
repack is failing.

I deployed Git 2.37 and enabled cruft packs via `gc.cruftPacks=true`
in the global gitconfig. I simultaneously increased `gc.pruneExpire`
to 1 day (as we're no longer concerned about each unreferenced object
turning into a single file). And I changed the frequently-invoked `git
maintenance` code to not run `loose-objects` or `incremental-repack`
if a gc.pid file is present (this is still racy but should hopefully
work enough of the time). This appeared to work: `git gc` created a
cruft pack and didn't complain about anything disappearing out from
under it, despite `git maintenance` running `commit-graph` and
`pack-refs` tasks during the long-running `git gc` / `git repack`.

However, I think there is yet another bug at play: running
`incremental-repack` appears to be able to repack the cruft packfile.
In doing so, it deletes its .mtimes file and the metadata inside. This
behavior seems wrong to me: I was expecting the cruft packfile to
linger until the next `git gc` and/or for its mtimes metadata to be
preserved across future repacks. Otherwise, what is the point in
retaining granular mtime metadata? Is this the behavior expected? Or
do cruft packfiles just not work as intended alongside use of the
`incremental-task` maintenance task / multi-pack index in Git 2.37?



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux