[PATCH v2 0/1] pack-refs: always refreshing after take the lock file

16657101987@xxxxxxx · Thu, 1 Aug 2019 02:35:43 +0800

From: Sun Chao <16657101987@xxxxxxx>

On Tue, 30 Jul 2019 02:36:35 -0400, Jeff King wrote:

> You can do this step without the fetch, which makes it hit the race more
> quickly. :) Try this:
> 
>   # prime it with a single commit
>   git commit --allow-empty -m foo
>   while true; do
>     us=$(git commit-tree -m foo -p HEAD HEAD^{tree}) &&
>     git update-ref refs/heads/newbranch $us &&
>     git update-ref refs/heads/master $us &&
>     git update-ref -d refs/heads/newbranch &&
>     them=$(git rev-parse master) &&
>     if test "$them" != "$us"; then
>       echo >&2 "lost commit: $us"
>       exit 1
>     fi
>     # eye candy
>     printf .
>   done

Thanks, this could hit the race more quickly and I update it to the
commit log.

> I don't think this is quite the same as racy-git. There we are comparing
> stat entries for a file X to the timestamp of the index (and we are
> concerned they were written in the same second).
> 
> But here we have no on-disk stat information to compare to. It's all
> happening in-process. But you're right that it's a racy stat-validity
> problem.

Yes, I agree with you.

> The stat-validity check here is actually more than the timestamp.
> Specifically it's checking the inode and size. But because of the
> specific set of operations you're performing, this ends up correlating
> quite often:
> 
>   - because our operations involve updating a single ref or
>     adding/deleting another ref, we'll oscillate between two sizes
>     (either one ref or two)
> 
>   - likewise if nothing else is happening on the filesystem, pack-refs
>     may flip back and forth between two inodes (not the same one,
>     because our tempfile-and-rename strategy means we're still using the
>     old one while we write the new packed-refs file).
> 
> So I actually find this to be a fairly unlikely case in the real world,
> but as your script demonstrates, it's not that hard to trigger it if
> you're trying.
>
> I'm not sure the second one actually fixes things entirely. What if I
> have an older refs/heads/foo, and I do this:
> 
>   git pack-refs
>   git pack-refs --all --prune
> 
> We still might hit the race here. The first pack-refs does not pack foo
> (because we didn't say --all), then a simultaneous "update-ref -d" opens
> `packed-refs`, then the second pack-refs packs it all in the same
> second. Now "update-ref -d" uses the old packed-refs file, and we lose
> the ref.

Yes, I agree with you. And in the real word if the git servers has some
3rd-service which update repositories refs or pack-refs frequently may
have this problem, my company's git servers works like this unfortunately.

> So I actually think the best path forward is just always refreshing when
> we take the lock, something like:
> 
> Ultimately the best solution there is to move to a better format (like
> the reftables proposal).

I do not know if we could get the new reftables in the next few versions,
So I commit the changes as you suggested, which is also the same as
another way I metioned in `PATCH v1`:

**force `update-ref -d` to update the snapshot before rewrite packed-refs.**

But if the reftables is comeing soon, please just ignore my PATCH :)

**And thank a lot for your reply, it's great to me, because it's my first
PATCh to git myself :)**

Sun Chao (1):
  pack-refs: always refreshing after take the lock file

 refs/packed-backend.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

-- 
2.22.0.214.g8dca754b1e