Re: Missing Refs after Garbage Collection

On Fri, Dec 21, 2012 at 05:41:43PM -0800, Earl Gresh wrote:

> I have observed that after running GC, one particular git repository
> ended up with some missing refs in the refs/changes/* namespace that
> Gerrit uses for storing patch sets. The refs were valid and should not
> have been pruned. Concerned about losing data, GC is still enabled
> but ref packing is turned off. Now the number of refs has grown to the
> point that it's causing performance problems when cloning the project.
> 
> Is anyone familiar with git gc deleting valid references? I'm running
> git version 1.7.8. Have there been any patches in later git releases
> that might address this issue ( if it is a git problem )?

I have never seen deletion, but I did recently find a race condition
with ref packing that caused rewinds, where:

  1. Two processes simultaneously repack the refs.

  2. At least one process is using an "old" version of the packed-refs
     file. That is, it cached the packed refs list earlier in the
     process and is now rewriting it based on that cached notion.

  3. The first process takes the lock, packs refs, drops the
     lock, and then deletes the loose versions. The simultaneous packer
     then takes the lock, overwrites the packed-refs file with a stale
     copy from its memory, and then releases the lock. We're left with
     the stale copy in packed-refs, and deleted loose refs.

In my case, it looked like a rewind, because the stale, memory-cached
refs had the old version. But if you have a ref which was not previously
packed, it would appear to have been deleted.
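
To make the sequence above concrete, here is a small in-memory model of
it. This is only a sketch: it touches no real repository, it runs the
steps in a fixed order rather than with real concurrency, and every name
in it (Repo, Packer, CachingDeleter, the ref names and values) is made
up for illustration.

```python
# Deterministic, in-memory model of the race. No real git repository is
# touched, and all class, ref, and value names are hypothetical.

class Repo:
    def __init__(self, packed, loose):
        self.packed = dict(packed)   # contents of the packed-refs file
        self.loose = dict(loose)     # loose ref files

    def resolve(self, name):
        # A loose ref takes precedence over a packed entry.
        if name in self.loose:
            return self.loose[name]
        return self.packed.get(name)


class Packer:
    """Models a fresh packer: reads current state, packs, prunes loose refs."""

    def run(self, repo):
        merged = dict(repo.packed)
        merged.update(repo.loose)
        repo.packed = merged   # written while holding the lock
        repo.loose = {}        # loose copies deleted after the lock is dropped


class CachingDeleter:
    """Models a process that cached packed-refs before the packer ran."""

    def __init__(self, repo):
        self.cache = dict(repo.packed)   # step 2: the stale snapshot

    def delete(self, repo, name):
        self.cache.pop(name, None)
        repo.loose.pop(name, None)
        repo.packed = dict(self.cache)   # step 3: overwrite with the stale copy


def simulate():
    repo = Repo(
        packed={"refs/heads/master": "old", "refs/heads/tmp": "t"},
        loose={"refs/heads/master": "new", "refs/changes/01/1/1": "c1"},
    )
    deleter = CachingDeleter(repo)           # caches packed-refs early
    Packer().run(repo)                       # packs everything, prunes loose refs
    deleter.delete(repo, "refs/heads/tmp")   # rewrites packed-refs from the cache
    return repo


repo = simulate()
print(repo.resolve("refs/heads/master"))     # old  (looks like a rewind)
print(repo.resolve("refs/changes/01/1/1"))   # None (looks like a deletion)
```

The ref that already had a packed entry comes back at its old value,
while the ref that only ever existed loose is gone entirely.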

The tricky thing about triggering this race is that step (2) needs a
process which has previously read and cached the packed-refs, and then
decided to pack the refs. The "git pack-refs" command does not do this,
because it starts, packs the refs, and exits. But processes which delete
a ref need to rewrite the packed-refs file (omitting the deleted ref),
and depending on the process, may have previously read and cached the
packed refs file. The obvious candidate is "receive-pack".
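
As a rough illustration of why the cached copy matters when deleting,
the sketch below contrasts rewriting from a snapshot taken before the
lock with re-reading the current contents once the lock is held. It is
an assumption-laden toy: PackedRefsFile and the two delete functions are
hypothetical, the "file" is just a dict, and the re-read variant is one
way to avoid the stale overwrite, not necessarily what the patch linked
below does.

```python
import threading


class PackedRefsFile:
    """Hypothetical stand-in for the packed-refs file plus its lock."""

    def __init__(self, refs):
        self.lock = threading.Lock()
        self.refs = dict(refs)


def delete_with_cache(f, cache, name):
    # Risky pattern: rewrite the file from a snapshot taken before the lock.
    with f.lock:
        cache.pop(name, None)
        f.refs = dict(cache)


def delete_with_reread(f, name):
    # Safer pattern: take the lock first, then re-read the current contents.
    with f.lock:
        current = dict(f.refs)
        current.pop(name, None)
        f.refs = current


def demo(use_cache):
    f = PackedRefsFile({"refs/heads/tmp": "t"})
    cache = dict(f.refs)              # snapshot cached early in the process
    with f.lock:                      # meanwhile, a packer adds a newly packed ref
        f.refs["refs/changes/01/1/1"] = "c1"
    if use_cache:
        delete_with_cache(f, cache, "refs/heads/tmp")
    else:
        delete_with_reread(f, "refs/heads/tmp")
    return f.refs


print(demo(use_cache=True))    # {}
print(demo(use_cache=False))   # {'refs/changes/01/1/1': 'c1'}
```

With the cached snapshot, the newly packed ref is silently dropped; with
the re-read, only the ref that was meant to be deleted disappears.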

So this may be your culprit if:

  1. This is a repo people are pushing into via C git.

  2. You simultaneously run "git pack-refs" (or "git gc") while people
     may be pushing.

You mentioned Gerrit, so I wonder if people are actually pushing via C
git (I thought it used JGit entirely). Or perhaps JGit has the same bug.
My fix (which is not yet released in any git version) is here:

  http://article.gmane.org/gmane.comp.version-control.git/211956

-Peff