Re: Simultaneous gc and repack

On Thu, Apr 13, 2017 at 10:31 AM, David Turner <novalis@xxxxxxxxxxx> wrote:
> Git gc locks the repository (using a gc.pid file) so that other gcs
> don't run concurrently. But git repack doesn't respect this lock, so
> it's possible to have a repack running at the same time as a gc.  This
> makes the gc sad when its packs are deleted out from under it with:
> "fatal: ./objects/pack/pack-$sha.pack cannot be accessed".  Then it
> dies, leaving a large temp file hanging around.
>
> Does the following seem reasonable?
>
> 1. Make git repack, by default, check for a gc.pid file (using the same
> logic as git gc itself does).
> 2. Provide a --force option to git repack to ignore said check.
> 3. Make git gc provide that --force option when it calls repack under
> its own lock.
>

What about making the code that currently calls repack call gc instead?
I guess that's more work than you strictly need, but..?
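
If the check in repack does turn out to be the way to go, steps 1 and 2
above could look roughly like the sketch below. It is Python rather than
git's C, purely to show the logic. It assumes gc.pid holds
"<pid> <hostname>" and that a lock file older than 12 hours can be
treated as stale; that is my understanding of what gc does, not something
I have checked against the source for this mail. The names
gc_lock_holder and may_repack are made up for illustration.

```python
import os
import socket
import time

# Assumption: a lock file older than this is considered stale.
STALE_AFTER_SECONDS = 12 * 60 * 60


def gc_lock_holder(git_dir):
    """Return the pid recorded in <git_dir>/gc.pid if the lock looks live,
    otherwise None. POSIX only: relies on signal 0 to probe the process."""
    path = os.path.join(git_dir, "gc.pid")
    try:
        with open(path) as f:
            fields = f.read().split(None, 1)
        mtime = os.stat(path).st_mtime
    except OSError:
        return None  # no lock file (or unreadable): nothing to respect

    # Assumed file format: "<pid> <hostname>".
    if len(fields) != 2 or not fields[0].isdigit():
        return None  # malformed lock file: ignore it
    pid, host = int(fields[0]), fields[1].strip()
    if pid <= 0:
        return None

    if time.time() - mtime > STALE_AFTER_SECONDS:
        return None  # too old to belong to a gc that is still running

    if host != socket.gethostname():
        return pid  # cannot probe a process on another host; assume it is live

    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return None  # the recorded process is gone
    except PermissionError:
        pass  # the process exists but belongs to another user
    return pid


def may_repack(git_dir, force=False):
    """Step 1: refuse while a gc lock looks live. Step 2: force skips the check."""
    return force or gc_lock_holder(git_dir) is None
```

gc itself would pass force=True when it calls repack under its own lock
(step 3). Note that this only checks the lock and does not take it, so
two repacks could still race each other.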

Thanks,
Jake

> This came up because Gitlab runs a repack after every N pushes and a gc
> after every M commits, where M >> N.  Sometimes, when pushes come in
> rapidly, the repack catches the still-running gc and the above badness
> happens.  At least, that's my understanding: I don't run our Gitlab
> servers, but I talked to the person who does and that's what he said.
>
> Of course, Gitlab could do its own locking, but the general approach
> seems like it would help other folks too.