Re: git packing leaves unpacked files

On Tue, 26 Sep 2006, Andy Whitcroft wrote:
>
> I was just looking at my kernel repository and noticed that even after a
> git repack -a -d I have some loose files.  A quick look at repack
> doesn't seem to explain why some are either not packed or are kept unpacked.
> 
> Is this something I should be expecting?

Depending on what you're doing, yes.

You can often get a hint of what is going on by just running 
"git-fsck-objects" and seeing the "dangling" objects - objects that exist, 
but are not reachable.
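
In current git releases the same check is spelled "git fsck". As a rough sketch of the check described above, the helper below keeps only the "dangling" lines of its output. The name list_dangling is made up for illustration and is not a git command; the use of "git -C" assumes a reasonably recent git.

```shell
# list_dangling is a hypothetical helper name, not part of git.
# Argument: path to a repository (defaults to the current directory).
list_dangling() {
    # fsck prints notices on stderr; the "dangling <type> <sha1>" lines
    # arrive on stdout, so discard stderr and filter stdout.
    git -C "${1:-.}" fsck 2>/dev/null | grep '^dangling'
}

# Usage (from anywhere): list_dangling /path/to/repo
```

Each line it prints names the object type (blob, tree or commit) and the object name, which can then be handed to "git cat-file -p" or gitk.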

There are a few things that cause dangling objects quite normally:

 - If you use "git update-index" to update the index half-way, and then do 
   more work, and use "git update-index" again (or commit), then the 
   half-way work will be visible in the form of dangling blobs. You can 
   just do a "git cat-file -p <blobname>" and see it, and maybe you'll 
   recognize that it was something you were about to commit, but never 
   did, because you did further development.

 - if you ever rebase any branch in the project, or do "git reset" to set 
   it to some old point, or delete a branch, dangling commits are very 
   much to be expected.

 - Even if _you_ didn't rebase anything, if the project you track rebases 
   itself, you'll get dangling objects because you had commits that became 
   unreachable when they were replaced by new history.

   My kernel tree doesn't do that, but some other ones occasionally do, 
   and git itself (in the "pu" branch) obviously does all the time.

   This is probably the most common reason, especially if you follow 
   Junio's git tree.

   The most common sign of this is that there are a few dangling commits, 
   and when you use gitk to examine them, you see old valid commits that 
   just aren't reachable any more.

 - if you do any merges at all, and they've conflicted, or there was more 
   than one common ancestor and the recursive merger has generated an 
   intermediate version of the tree, the merge process will leave the 
   objects of those intermediate merges around as dangling left-overs 
   that aren't actually reachable from the end result of the merge.

   The most common form of this is that you see a few dangling "blob"s, and 
   when you do "git cat-file -p <sha1> | less -S" on the blob-file, you'll 
   generally find a conflict marker in it (ie the "<<<<" "====" ">>>>" 
   things that a three-way merge leaves behind). You might also have a 
   whole dangling tree due to this.

 - if you use the rsync:// protocol, you'll often end up getting objects 
   that aren't reachable from the heads _you_ have, because you got the 
   whole object database from somebody else that had other heads (or, you 
   might get the dangling objects that they had due to any of the reasons 
   above).

   The rsync:// protocol simply doesn't do any git-level reachability 
   analysis, so it just gets everything, regardless.
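
The first case in the list (staging a file half-way, then staging it again) is easy to reproduce in a throwaway repository. The following is only a sketch: the temporary directory, the file name notes.txt and its contents are invented for the example, and it uses the modern "git fsck" spelling.

```shell
# Throwaway repository; path, file name and contents are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

echo "first draft" > notes.txt
git update-index --add notes.txt    # stage the half-way version

echo "second draft" > notes.txt
git update-index --add notes.txt    # restaging replaces the index entry

# The first blob is still in the object database, but nothing refers to it.
dangling=$(git fsck 2>/dev/null | grep '^dangling blob')
echo "$dangling"

# Look at what the abandoned half-way version contained.
blob=$(echo "$dangling" | awk '{print $3}')
content=$(git cat-file -p "$blob")
echo "$content"
```

The fsck output should contain one "dangling blob" line, and cat-file on that object shows the first version of the file, i.e. the work that was staged but never committed.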

Hmm. Those are the main reasons I can think of. There may be other cases, 
but I think these are the main ones, and I think any other cases end up 
being just variations on the same kind of theme.

			Linus