Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle

On Thu, Oct 19, 2006 at 09:48:29AM -0700, Linus Torvalds wrote:
> On Thu, 19 Oct 2006, Jan Harkes wrote:
> > 
> > If we find a delta against a base that is not found in our repository we
> > can keep it as a delta, the base should show up later on in the
> > thin-pack. Whenever we find a delta against a base that we haven't seen
> > in the received part of the thin pack, but is available from the
> > repository we should expand it because there is a chance we may not see
> > this base in the remainder of the thin-pack.
> 
> Yes, indeed. We can also have another heuristic: if we find a delta, and 
> we haven't seen the object it deltas against, we can still keep it as a 
> delta IF WE ALSO DON'T ALREADY HAVE THE BASE OBJECT. Because then we know 
> that the base object has to be there later in the pack (or we have a 
> dangling delta, which we'll just consider an error).
> 
> So yeah, maybe my patch-series is something we can still save.

It looks like you were really close. When we cannot resolve a delta, we
just write it to the packfile as-is and don't queue it. If it can be
resolved, we write it out as a full object.
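
To make that rule concrete, here is a minimal Python sketch. It is not
the C code from the patch series. The names PackEntry, rewrite_thin_pack
and apply_toy_delta are made up for illustration, and the "delta" here is
simply a suffix appended to the base, not git's real delta encoding.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class PackEntry:
    kind: str                      # "full" or "delta"
    data: bytes                    # object contents, or the delta payload
    base_id: Optional[str] = None  # only set for deltas


def apply_toy_delta(base: bytes, delta: bytes) -> bytes:
    # Assumption: stand-in for real delta application (a plain suffix).
    return base + delta


def rewrite_thin_pack(entries: Iterable[PackEntry],
                      repo: Dict[str, bytes]) -> List[PackEntry]:
    """Expand deltas whose base we already have; pass the rest through."""
    out: List[PackEntry] = []
    for entry in entries:
        if entry.kind == "delta" and entry.base_id in repo:
            # Resolvable: write it out as a full object.
            full = apply_toy_delta(repo[entry.base_id], entry.data)
            out.append(PackEntry("full", full))
        else:
            # Full object, or a delta we cannot resolve: write it
            # unchanged and do not queue it. An unresolved delta is
            # expected to have its base elsewhere in the same pack.
            out.append(entry)
    return out
```

Note that the sketch never needs the SHA1 of an unresolved delta; a
dangling delta would only be noticed by a later indexing pass.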

The only thing that cannot be reliably tracked is the pack index
information. The offsets are trivial, but we cannot calculate the SHA1
for a delta without applying it to its base. If the base comes later,
the existing code could do it, but if the delta has already been written
to the pack we can't easily track back.
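
As a small illustration of why the SHA1 needs the base: an object's name
is the SHA1 of a "<type> <size>\0" header followed by the full object
contents, so a delta on its own does not determine it. The sketch below
is Python rather than git's C, and toy_apply_delta is a hypothetical
stand-in (a plain suffix), not git's delta format.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    # Object name = SHA1 over "<type> <size>\0" followed by the contents.
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_apply_delta(base: bytes, delta: bytes) -> bytes:
    # Assumption: illustrative only, the "delta" is appended to the base.
    return base + delta


def delta_object_id(obj_type: str, base: bytes, delta: bytes) -> str:
    # The name can only be computed once the full contents are rebuilt.
    return git_object_id(obj_type, toy_apply_delta(base, delta))
```

The same delta applied to two different bases gives two different names,
which is why the index entry has to wait until the base is known.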

And why add all the extra complexity? Running git-index-pack after
git-update-objects --repack not only generates the correct index without
a problem, it also serves as an extra consistency check, and it keeps
this code isolated from any possible future changes to the index file
format.

I'll try to follow this up with 2 patches. One is an almost trivial
change to your code that makes it write out a pack in which full objects
are kept as they are and resolvable deltas are converted to full
objects. Any unresolved deltas are expected to be relative to some other
object in the same pack.

The rewritten pack is indexed correctly even when I run git-update-index
in a repository that does not contain any of the objects in the thin-pack.
Of course it also works when the objects are available, but the resulting
full pack is considerably bigger, since we can then find a suitable base
for every delta and so every delta gets expanded.

> However, the thing that makes me suspect that it is _not_ saveable, is 
> this:
...
> The answer is: no. It's not trivial. Or rather, it _is_ trivial, but you 
> have to _remember_ all of the actual data for A, B, C and D all the way to 
> the end, because only if you have that data in memory can you actually 
> _recreate_ B, C and D even enough to get their SHA1's (which you need, 
> just in order to know that the pack is complete, much less to be able to 
> create a non-delta version in case it hadn't been).

Only if you want to build the index at the same time. Otherwise we don't
need to know the SHA1 values for unresolved deltas.

> Anyway, I just pushed the "rewrite-pack" branch to my git repo on 
> kernel.org, so once it mirrors out, if you really want to try to fix up 
> the mess I left behind, there it is:

I think I still left quite a bit of the mess unfixed.

Jan