Issue with git log and reference repositories using --dissociate and --filter=blob:none

Hello all, I am working on an application that keeps a reference cache of repositories. We use that cache on subsequent clones to avoid some of the overhead of cloning each repository from scratch, which we need to do fairly frequently.

We've run into an issue with git log combined with our use of the reference repository (and possibly --dissociate on the clone) where, for a while after the clone, the commit history cannot be read and git log fails with the error:
> error: Could not read 2ffba4df2f9ec9df145fcdd84fe20a3d934b4555
> fatal: Failed to traverse parents of commit 7603ede45da4d396f2641b01e2ef3e13d49b572f

This is part of a user facing feature but thankfully it's extremely infrequent as of right now.

We do a partial clone of the repository with the following
> git -c core.symlinks=false -c gc.auto=0 clone --reference ./$reference_repo --dissociate --filter=blob:none --no-checkout -b master https://github.com/mattcree/dissociate-clone-issue ./$clone_repo

Then we try to get the log:
> git -c core.symlinks=false -c gc.auto=0 log -100 --first-parent --pretty="%H %an %ct %s" master b3447a67238c760aa2845d32e5eb95b96e67c733

I set the following trace options to help debugging
> GIT_TRACE=2 GIT_CURL_VERBOSE=2 GIT_TRACE_PERFORMANCE=2 GIT_TRACE_PACK_ACCESS=2 GIT_TRACE_PACKET=2 GIT_TRACE_PACKFILE=2 GIT_TRACE_SETUP=2 GIT_TRACE_SHALLOW=2
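In case it helps anyone repeat this, here is a minimal sketch of how those variables can be applied to a single command with its stderr captured to a file. The helper name `with_git_trace` and the log file argument are my own for illustration, not something git provides; the variables and values are exactly the ones listed above.

```shell
# Hypothetical helper (not part of git): run one command with the trace
# variables from this report set, appending the command's stderr to a log file.
# Usage: with_git_trace <logfile> <command> [args...]
with_git_trace() {
    logfile=$1
    shift
    # env applies the variables only to this one command invocation.
    env GIT_TRACE=2 \
        GIT_CURL_VERBOSE=2 \
        GIT_TRACE_PERFORMANCE=2 \
        GIT_TRACE_PACK_ACCESS=2 \
        GIT_TRACE_PACKET=2 \
        GIT_TRACE_PACKFILE=2 \
        GIT_TRACE_SETUP=2 \
        GIT_TRACE_SHALLOW=2 \
        "$@" 2>>"$logfile"
}

# Example (assumes the current directory is the clone):
# with_git_trace ./log-trace.txt git -c core.symlinks=false -c gc.auto=0 \
#     log -100 --first-parent --pretty="%H %an %ct %s" master
```

Keeping the trace in a separate file leaves stdout clean, so the log output itself can still be compared between runs.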

From what I can tell, the following is happening:
1. We clone the repository using --dissociate, which forces a repack of the cloned repository
2. The clone completes fast (on git 2.40+) but 'git-remote-https' keeps running in the background
3. If I request the log while that process is still running, the bug appears
4. Once the 'git-remote-https' process ends, the log can be requested successfully
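Since the steps above suggest the failure is transient, one stopgap would be to retry the log until it succeeds. The sketch below is only an illustration of that idea, not something we run in production; the helper name `retry_until_ok` and the attempt and delay values are assumptions of mine.

```shell
# Hypothetical helper: re-run a command until it succeeds or a limit is hit.
# Usage: retry_until_ok <max_attempts> <delay_seconds> <command> [args...]
retry_until_ok() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example (assumes the current directory is the clone; 10 and 1 are arbitrary):
# retry_until_ok 10 1 git -c core.symlinks=false -c gc.auto=0 \
#     log -100 --first-parent --pretty="%H %an %ct %s" master
```

This only hides the symptom; it does not explain why the commit is unreadable while the background process is still running.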

From what I can tell, git repack is called during the dissociate step, which I guess rewrites the pack files. As a result, the pack file containing the commit objects may not exist at the moment git log is called.

This means that when walking the commits, git log eventually fails to find one here https://github.com/git/git/blob/492ee03f60297e7e83d101f4519ab8abc98782bc/revision.c#L1106 -- this code path does seem aware of the possibility of missing objects, but in this case the arguments it has been given clearly don't stop it from failing here.

When we remove '--dissociate' this issue does not appear. The decision to use dissociate was mainly just driven by perceived safety (I did not work on this part and can't say either way) -- we may fix it for now by removing this.

When we remove '--filter=blob:none' the issue also does not appear.

When running the log command, Git 2.39.3 appears to print quite a bit of http logging e.g.
> Info: [HTTP/2] [1] OPENED stream for https://github.com/mattcree/dissociate-clone-issue/info/refs?service=git-upload-pack

This does not appear with 2.40+. I believe this is either a bug or a poorly handled scenario caused by the lazy retrieval of pack files, but I can't say for sure. This is the second time I'm coming to the mailing list with an 'is this a bug?' type question -- it appears to me that it is, but our use case is fairly niche, so I wasn't sure if we had found something here. I've dug into the code a bit and there are a lot of moving parts which I am not familiar with, so I may have missed something, but I think I've covered the main issues and I have supplied a recreation below.

The following gist contains a recreation: the reference repository state, the repository to clone, a script for repeating the situation, and a selection from the log output of running with `--filter=blob:none`. It's probably easier to run the script yourself for the full output.
> https://gist.github.com/mattcree/b5fcd364c97219465f37b62598db36b0





