Re: how to remove unreachable objects?

On Mon, Sep 19, 2011 at 05:40:03PM -0700, Junio C Hamano wrote:

> > Does that work? I had the impression from the documentation that the
> > arguments are purely about the reachability analysis, and that the
> > corruption/correctness checks actually look through the object db
> > directly, making sure each object is well-formed. Skimming cmd_fsck
> > seems to confirm that.
> 
> You are right that you may see "corrupt object" for objects in the
> object store that are unreachable from the tips, but I was talking more
> about verifying that everything needed for reachability analysis from
> the given tips can be read, i.e., no "missing object" errors, the lack
> of which would mean you can salvage everything reachable from the refs
> involved in the traversal.

True. Though one could also do that with "git log", and it would be much
cheaper (since each trial you run with git-fsck is going to fsck the
whole object db, which is expensive).

I can't help but think the right solution there is something like:

  1. If the corrupted or missing object is a blob or tree, figure out
     which commits reference it with something like:

       a. Create a set B of bad objects (blobs or trees).

       b. For each tree in the object db that is not already in B, open
          it and see if it contains any elements of B. If so, add the
          tree to another (initially empty) set, B'.

       c. If B' is empty, done. Otherwise, add elements from B' to B and
          go back to step (b).

       d. For each commit in the object db, open and check the tree
          pointer. If it points to an element of B, then the commit is
          bad.

  2. If the object is a commit, or if you arrived at a set of bad
     commits through step (1), then use "branch --contains" on the
     bad commits.

which is algorithmically efficient (though probably slow if you had to
cat-file each tree). It might be a handy special command, though (I have
seen people ask for "which part of history references this blob" on
occasion). I've never bothered writing it because I've never had a
corrupt object. :)
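
For illustration, steps (a) through (d) could be sketched roughly as
below. This is only a sketch under some assumptions: the function name
find_bad_commits and the two input maps are made up for the example, and
it presumes the tree entries and commit tree pointers have already been
read out of the object db somehow (for instance with "git ls-tree" and
"git cat-file"). Trees already marked bad are skipped on each pass so
that the loop terminates.

```python
from typing import Dict, Iterable, Set


def find_bad_commits(
    bad: Iterable[str],
    trees: Dict[str, Set[str]],
    commits: Dict[str, str],
) -> Set[str]:
    """Return commits whose root tree is bad or transitively contains a bad object.

    bad     -- ids of the known corrupt/missing blobs or trees (hypothetical input)
    trees   -- maps each readable tree id to the ids of its direct entries
    commits -- maps each commit id to the id of its root tree
    """
    bad_set = set(bad)  # step (a): the set B

    while True:
        # step (b): trees not yet in B that directly contain an element of B
        newly_bad = {
            tree
            for tree, entries in trees.items()
            if tree not in bad_set and entries & bad_set
        }
        if not newly_bad:  # step (c): B' is empty, so we are done
            break
        bad_set |= newly_bad  # otherwise fold B' into B and repeat

    # step (d): a commit is bad if its tree pointer is an element of B
    return {commit for commit, root in commits.items() if root in bad_set}


if __name__ == "__main__":
    # Toy object db: only c1 reaches the bad blob, via tree_sub.
    example_trees = {
        "tree_root1": {"tree_sub", "blob_ok"},
        "tree_sub": {"blob_bad"},
        "tree_root2": {"blob_ok"},
    }
    example_commits = {"c1": "tree_root1", "c2": "tree_root2"}
    print(find_bad_commits({"blob_bad"}, example_trees, example_commits))
```

The resulting commit ids are what step (2) would then feed to
"git branch --contains". Each pass rescans every tree, which is fine for
a sketch; a reverse index from entry to containing trees would avoid the
repeated scans.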

Anyway, that is perhaps not relevant to your point. But I do think that
fsck with arguments is more likely to confuse someone than to actually
be part of a productive use-case. I have no problem with deprecating or
removing it.

-Peff

