Re: [PATCH 4/8] git-repack --max-pack-size: add fixup_header_footer()

On Mon, 9 Apr 2007, Dana How wrote:

> On 4/9/07, Nicolas Pitre <nico@xxxxxxx> wrote:
> > On Mon, 9 Apr 2007, Dana How wrote:
> > > Wouldn't the following address the "object count unknown
> > > at the start of sequential pack writing" problem:
> > >  Write 0 for object count in the header. This is a flag to look for
> > >  another header of same format just before the final SHA-1 which
> > >  has the correct count. The SHA-1 is still a checksum of everything
> > >  before it and no seeking/rewriting is needed on generation.
> > 
> > No.  You really want to know up front how many objects a pack contains
> > when streaming it.  And this is not only for packs written to stdout.
> OK, let me ask a dumb question and flog one last additional obvious idea.
> 
> Does your wanting to know stem from more than wanting
> to stick to one malloc of all the object info at once?

That, plus progress reporting when fetching.  And progress reporting is 
probably the most important thing for the user experience.

> My suggestion quoted above is actually a change to the .pack format.
> With all the other ideas for .pack format changes floating around,
> let me withdraw that and suggest a simpler one: write a "0" in the header,
> and terminate the pack with a sentinel in object format before the final
> SHA-1s.
> The sentinel would be type=OBJ_NONE/length=0, i.e. a null byte.
> "Not much" would need to be updated to tolerate it and
> you could count objects while looking for it (if header has 0)
> during normal processing.  (I'm reacting to your word "streaming".)
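
For readers unfamiliar with the per-object header, here is a minimal
sketch of why a type=OBJ_NONE/length=0 entry comes out as a single null
byte. It assumes the usual pack object header layout (first byte: a
continuation bit, 3 bits of type, the low 4 bits of the size; following
bytes carry 7 more size bits each). The function name is hypothetical and
this is not code from git itself.

```python
OBJ_NONE = 0  # type value the proposed sentinel would use
OBJ_BLOB = 3

def encode_object_header(obj_type, size):
    """Encode a pack object header: type plus variable-length size."""
    # First byte: 3 bits of type and the low 4 bits of the size.
    current = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        # More size bits follow, so set the continuation bit.
        out.append(current | 0x80)
        current = size & 0x7F
        size >>= 7
    out.append(current)
    return bytes(out)

sentinel = encode_object_header(OBJ_NONE, 0)
```

With type 0 and size 0 there are no bits to set, so the sentinel is the
one byte 0x00, which no real object header can start with because every
real object has a non-zero type.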

When streaming you really want to know up front how much to expect / 
wait for.
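
To make the point concrete, here is a rough sketch of what a receiver can
do once the count is in the header. It assumes the 12-byte pack header
layout (the signature "PACK", then a 4-byte version and a 4-byte object
count, both in network byte order). The function names and the progress
format are made up for illustration; this is not how git's own code is
written.

```python
import struct

# signature, version, object count (big-endian / network byte order)
PACK_HEADER = struct.Struct(">4sII")

def parse_pack_header(data):
    """Return (version, object_count) from the first 12 bytes of a pack."""
    if len(data) < PACK_HEADER.size:
        raise ValueError("truncated pack header")
    signature, version, count = PACK_HEADER.unpack_from(data)
    if signature != b"PACK":
        raise ValueError("not a pack stream")
    if version not in (2, 3):
        raise ValueError("unsupported pack version %d" % version)
    return version, count

def progress_line(done, total):
    """Format a progress message; only possible when total is known."""
    percent = done * 100 // total if total else 100
    return "%d%% (%d/%d)" % (percent, done, total)

header = PACK_HEADER.pack(b"PACK", 2, 1000)
version, count = parse_pack_header(header)
object_info = [None] * count  # one allocation, sized up front
```

With a zero in the header, neither the single allocation nor the
percentage in progress_line() would be available until the end of the
stream.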

> BTW,  I've caught up on reading the mailing list archives,
> but I don't recall seeing any overview of the objectives of pack v2/v3/v4.
> Does that exist anywhere? I didn't see it in Documentation or
> Documentation/technical. It would probably reduce uninformed
> questions like the above.  I've deduced rationales for what
> miscellaneous details I have seen,
> except moving the SHA-1s from .idx to .pack (?).

I'm sure this exists in the archive as I recall sending a summary about 
that a while ago.

In short:

 - Pack v1 was the initial implementation from Linus.  It had some flaws
   and didn't exist for more than 2 or 3 days.  It was never used in any 
   official release.

 - Pack v2 is what GIT still produces today.

 - Pack v3 was created when a bit in the delta encoding was redefined.
   See commit d60fc1c8649f8 for the details.  Because it caused too many
   compatibility issues when we attempted to enable pack v3 at the time, 
   we reverted pack generation to v2.

 - Pack v4 has a much larger scope.  This is the WIP from Shawn Pearce 
   and me, and I know for sure Shawn has already posted a detailed design 
   overview to the list.


Nicolas