On Wed, 16 Dec 2009, Eric Paris wrote:

> On Wed, 2009-12-16 at 16:06 -0500, Nicolas Pitre wrote:
> > On Wed, 16 Dec 2009, Eric Paris wrote:
> >
> > > On Tue, 2009-12-15 at 22:03 -0500, Nicolas Pitre wrote:
> > > > On Mon, 14 Dec 2009, Eric Paris wrote:
> > > >
> > > > > The alternative repo is slowly pushing up to that same location.  That
> > > > > tar is 855838982, so just a tad bit smaller.
> > > >
> > > > It doesn't appear to be complete yet, and not progressing either.
> > >
> > > The alternative repo is now available (but the original is down)
> > >
> > > I tried to run git gc --aggressive last night while I slept and got this
> > > as output, maybe it helps point to a solution/problem?  The git reflog
> > > portion ran for 5 hours and 36 minutes and appears to have finished.
> >
> > Yes.  I was able to reproduce your issue.  And because of the *horrible*
> > repository packing, the reflog expiration process is taking ages when
> > determining object reachability at a rate of one reflog entry every 2
> > seconds or so.  With 4214 entries for the fsnotify-syscall branch, and
> > 1352 entries for the fsnotify branch, this already takes up a significant
> > portion of the actual run time.  I'm sure if your repository was
> > properly packed this would take less than a minute.
>
> I'm guessing this is a result of stgit?  These branches really should
> be just a branch from a tag (which exists in kernel-1) and about 30-50
> patches linearly applied on top.  I don't know how I get that many
> objects.  I'm guessing many/most of them are crap that should be able to
> be cleaned/deleted entirely as the rebasing/pushing/popping/updating that
> stgit does under the covers should have rendered them pointless.  Not
> really sure when/how that should/could have happened.

Possible.  Commit operations (including patch applications) always
create loose objects because this is fast, with the expectation that
they get collected in a pack later.

> Should I be running git-gc every night?

This is certainly a good thing to do given your heavy stgit usage.

> > Now, repacking doesn't work because...
> >
> > > $ git gc --aggressive
> > > error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
> > > error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
>
> /me is pretty git dumb, but is there some way to figure out the parents
> or children of these?  I just trolled through all of my directories
> doing git show and didn't get any hits.  I guess I'll just clean up and
> start over....

After moving the reflog data aside (i.e. mv .git/logs .git/logs.bak),
it seems that d936ff8 is not referenced anymore.  I found the other one
as follows:

First I tried

    $ git rev-list --all --objects

This resulted in:

    [...]
    4f7911b0b0dbd187131a109cf00161a0c6a9d727 arch/x86
    ea868257c1eabc31e0ea7941efa42b543978b3fa arch/x86/kvm
    a0c11ead723956c667172a9f3fb6787684fe7ff5 arch/x86/kvm/paging_tmpl.h
    b556b6aad8b1aacfecb1dd4a56dbd389674687b5 arch/x86/kvm/x86.c
    68a9733ae3315d7e2bfec2037dfeee4db8a6f6a1 drivers
    error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
    fatal: bad tree object 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f

Because of the way objects are enumerated, we can be pretty sure that
the bad tree object is referenced by the tree object 68a9733a
corresponding to drivers/.  Let's verify that:

    $ git ls-tree 68a9733a
    100644 blob 00cf9553f74065291612b0971337f79995933a06    Kconfig
    100644 blob c1bf41737936ab00be4a87563a0bb0638074785d    Makefile
    040000 tree d4e847de9bf2450842936582ea7cc6778413825b    accessibility
    040000 tree 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f    acpi
    [...]

Yep, we found it there.  So the missing tree object corresponds to
drivers/acpi/.

To find the latest commit that references this particular tree object,
we just need to look at the same rev-list output above (piping it into
less is handy here) and scroll up until an object with no name is
found.  This would usually be the first root tree object referencing
the named objects that follow.
Here I get aafb68eb.  To be sure, let's list it to confirm that it
really contains a reference to the 68a9733a drivers tree:

    $ git ls-tree aafb68eb
    [...]
    040000 tree 68a9733ae3315d7e2bfec2037dfeee4db8a6f6a1    drivers
    [...]

So yes, we've got the right root tree object.  Now finding the
corresponding commit should be easy:

    $ git log --all --pretty=raw

Then within less, a search for aafb68eb brings us to this:

    commit 2e765e9c87a337131aad3014f9a7e5e878c7d0a0
    tree aafb68eb84f96c9ab5697c6e8d10d5006d1e7209
    parent a2c2de42295b3ac29758f454a7072338e5555ca3
    author Eric Paris <eparis@xxxxxxxxxx> 1237233261 -0400
    committer Eric Paris <eparis@xxxxxxxxxx> 1237233261 -0400

        refresh 64d34c511b1125d9efd2926e683e019f15dec5b4

So this is referenced by a commit that you made on the 1237233261th
second since January 1, 1970, i.e. 2009-03-16 19:54:21 +0000, which is
quite a while ago.  And given the nature of the commit log, this is
probably some stgit branch.

Note that the missing tree didn't necessarily appear with that commit.
Because of the recency ordering from rev-list, all we can say is that
this is the last commit on that particular branch to reference that
tree, but it might have been introduced in the repository way before
that point in time.

Now let's try to find out what branch(es) actually link(s) to this
commit:

    $ git branch -a --contains 2e765e9c

This comes up empty.  This is because 'git branch' looks only in the
refs/heads/ and refs/remotes/ namespaces (or only one of them without
-a).  Scripting something around 'git for-each-ref' and
'git merge-base' could be done, such as:

    TARGET=2e765e9c87a337131aad3014f9a7e5e878c7d0a0
    # No pattern is given, so refs from every namespace are listed.
    git for-each-ref |
    while read sha1 type ref; do
        if [ "$(git merge-base "$sha1" "$TARGET")" = "$TARGET" ]; then
            echo "referenced by $type $ref"
        fi
    done

But this is slow, for the same reason as 'git reflog expire' above.
Still, letting it run for a while should give you at least one answer.
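
On a reasonably recent Git, the same question can probably be answered
without one 'git merge-base' call per ref.  The sketch below assumes
Git 2.7 or later, where 'git for-each-ref' accepts --contains (this
option did not exist when this thread was written).  The function name
refs_containing is hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: list every ref, in any namespace, whose history
# includes the given commit.
# Assumption: Git 2.7 or later, for "git for-each-ref --contains".
refs_containing() {
    target=$1
    git for-each-ref --contains "$target" \
        --format='referenced by %(objecttype) %(refname)'
}

# Example with the commit found above (run inside the repository):
#   refs_containing 2e765e9c87a337131aad3014f9a7e5e878c7d0a0
```

Unlike 'git branch -a --contains', this is not limited to refs/heads/
and refs/remotes/, so refs kept elsewhere under refs/ are reported too.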

> > Of course, usage of alternates is recommended _only_ with repositories
> > that are stable, i.e. don't ever add repositories to
> > .git/objects/info/alternates if those repositories are rewound/rebased
> > and/or branches in them are deleted/replaced.  That could be a reason
> > why some objects are now missing from the repository using alternates.
>
> So I'm not sure how I did things wrong.  My kernel-1 has that bunch of
> remotes.  The linux-next remote, like I said, basically rebases to
> linus' tree, then merges 150 random branches.  It tags that tree every
> day and I pull those tags.  So I would never expect any objects from
> those remote trees to ever disappear.

Right.

> Now I created branches in kernel-1 and I certainly have done lots of
> things like so
>
> git checkout -b testing remotes/linux-next/master
> [edit]
> git commit -a
> git checkout -b testing1 remotes/linux-next/master
> git branch -D testing
>
> My assumption though was that this wouldn't ever affect my other
> repositories.  My other repository branches always started by checking
> out a branch with remotes/*/* as the base.
>
> My understanding was that I would only run into problems if I used
> something on a branch I created myself in the alternatives repo in other
> repos (and I didn't remove remotes)
>
> I guess it's not impossible to believe that at some point in time I
> would have exported patches to an mbox from kernel-1 and applied them
> to kernel-2 or vice versa.  I guess this would end up with the same
> objects, right?  Then if I deleted the branch in kernel-1 I would have
> problems in kernel-2?

Eventually, yes.  After a while the auto repack in kernel-2 would
notice that some objects are in kernel-1 already and purge them from
kernel-2.  And if those objects were part of a deleted branch, then
kernel-1 would get rid of those objects too once the reflog with a
reference to that deleted branch expires.  The unsuspecting kernel-2
repo then gets broken.
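
This kind of breakage can be noticed early by walking every reachable
object in each repository that borrows from another one, using the same
'git rev-list --all --objects' walk shown earlier in this message.  The
following is only a minimal sketch: check_reachable is a hypothetical
helper name, and 'git -C' assumes Git 1.8.5 or later.

```shell
# Hypothetical helper: report whether every object reachable from the
# refs of a repository can still be read, including objects borrowed
# through objects/info/alternates.
# Assumption: Git 1.8.5 or later, for "git -C".
check_reachable() {
    repo=$1
    # Resolve the git directory to an absolute path (runs in a subshell,
    # so the caller's working directory is left alone).
    gitdir=$(cd "$repo" && cd "$(git rev-parse --git-dir)" && pwd) || return 2

    if [ -f "$gitdir/objects/info/alternates" ]; then
        echo "$repo borrows objects from:"
        sed 's/^/  /' "$gitdir/objects/info/alternates"
    fi

    # The walk fails (e.g. "bad tree object") when an object is missing.
    if git -C "$repo" rev-list --all --objects >/dev/null 2>&1; then
        echo "$repo: all reachable objects present"
    else
        echo "$repo: missing or unreadable objects"
        return 1
    fi
}

# Example (the path is a placeholder):
#   check_reachable /path/to/kernel-2
```

On a repository the size of the kernel this walk takes a while, so it
is better suited to a nightly job than to interactive use.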

> I guess I'll rebuild my setup
>
> new kernel-alt has just the remotes, and my kernel-1,2,3 all alt to it
> I'll never have local branches in my kernel-alt
> I'll run git-gc every night
> I'll hope to never have a problem again.
>
> Sound good?

Yes.  And make sure not to fetch rebasing repositories, such as
linux-next, into kernel-alt without keeping a tag for each fetched
state; otherwise you'll accumulate unreferenced objects which the other
repositories might rely upon.


Nicolas