On Mon, Jun 01, 2015 at 09:37:17AM +0200, Stefan Näwe wrote:

> One of my repos started giving an error on 'git gc' recently:
>
> $ git gc
> error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85
> Counting objects: 3052, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (531/531), done.
> Writing objects: 100% (3052/3052), done.
> Total 3052 (delta 2504), reused 3052 (delta 2504)
> error: Could not read 7713c3b1e9ea2dd9126244697389e4000bb39d85

The only error string that matches that is the one in parse_commit(),
when we fail to read the object. It happens twice here because `git gc`
runs several subcommands; you can see which ones are generating the
error if you run with GIT_TRACE=1.

I am surprised that it doesn't cause the commands to abort, though. If
we are traversing the object graph to repack, for example, we would want
to abort if we are missing a reachable object (i.e., the repository is
corrupt).

> I tried:
>
> $ git cat-file -t 7713c3b1e9ea2dd9126244
> fatal: Not a valid object name 7713c3b1e9ea2dd9126244

Not surprising, if we don't have the object. What is curious is why git
wants to look it up in the first place. I.e., who is referencing it?

Either:

  1. It is an object that we are OK to be missing (e.g., the
     UNINTERESTING side of a traversal), and the error should be
     suppressed.

  2. Your repository really is corrupted, and this is a case where we
     need to be paying attention to the return value of parse_commit
     but are not.

I'd love to see:

  - the output of "GIT_TRACE=1 git gc" (to see which subcommand is
    causing the error)

  - the output of "git fsck" (which should hopefully confirm whether or
    not there is a real problem)

  - any mentions of the sha1 in the refs or reflogs. Something like:

       sha1=7713c3b1e9ea2dd9126244697389e4000bb39d85
       cd .git
       grep $sha1 $(find packed-refs refs logs -type f)

  - If that doesn't turn up any hits, then presumably it's an object
    referencing the sha1.
    We can dig into the objects (all of them, not just reachable
    ones), like:

       # run from the top of the working tree (not from inside .git)
       {
         # loose objects
         (cd .git/objects && find ?? -type f | tr -d /)
         # packed objects
         for i in .git/objects/pack/*.idx; do
           git show-index <"$i"
         done | cut -d' ' -f2
       } |
       # omit blobs; they are expensive to access and cannot have
       # reachability pointers
       git cat-file --batch-check='%(objecttype) %(objectname)' |
       grep -v ^blob |
       cut -d' ' -f2 |
       # now get all of the contents, and look for our object; this is
       # going to be slow, since it's one process per object; but we
       # can't use --batch because we need to pretty-print the trees
       xargs -n1 git cat-file -p |
       less +/$sha1

I would have guessed this was caused by trying to traverse unreachable
recent objects for reachability. It fits case 1 (it is OK for us to be
missing these objects, but we might accidentally complain), and it would
probably happen twice during a gc (once for the repack, and once for
`git prune`). But that code should not be present in older versions of
msysgit, as it came in v2.2.0 (and I assume "older msysgit" is v1.9.5).

And if that is the problem, it would follow a copy of the repo, but not
a clone (though I guess if your clone was on the local filesystem, we
blindly hardlink the objects, so it might follow there).

-Peff
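
A more compact take on the object search above, as a rough sketch only.
It assumes a git new enough to have `cat-file --batch-all-objects`
(v2.6.0 or later, I believe, so not the msysgit versions discussed
here), and `find_referrers` is a made-up helper name, not something git
ships. Unlike the `less` pipeline, it prints the type and name of each
object whose pretty-printed contents mention the sha1, so you can see
who is doing the referencing.

```shell
# Hypothetical helper (not part of git): list every non-blob object whose
# pretty-printed contents mention the given object name.
# Assumption: `git cat-file --batch-all-objects` is available.
find_referrers() {
  target=$1
  # enumerate all objects in the repository, reachable or not
  git cat-file --batch-all-objects \
      --batch-check='%(objecttype) %(objectname)' |
  while read -r type name; do
    # blobs cannot point at other objects, so skip them
    [ "$type" = blob ] && continue
    # still one process per object, so expect this to be slow
    if git cat-file -p "$name" | grep -q "$target"; then
      printf '%s %s\n' "$type" "$name"
    fi
  done
}
```

Run it from inside the repository, e.g. `find_referrers "$sha1"`. A
"commit" line in the output would mean some commit names the missing
object as its parent; a "tag" line would mean a tag points at it.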