Re: [PATCH v3] Document pack v4 format

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 6 Sep 2013, Duy Nguyen wrote:

> On Fri, Sep 6, 2013 at 8:25 PM, Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
> > On Fri, 6 Sep 2013, Duy Nguyen wrote:
> >
> >> On Fri, Sep 6, 2013 at 10:23 AM, Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
> >> > On Fri, 6 Sep 2013, Nguy­n Thái Ng÷c Duy wrote:
> >> >
> >> >>
> >> >> Signed-off-by: Nguy­n Thái Ng÷c Duy <pclouds@xxxxxxxxx>
> >> >> ---
> >> >>  Should be up to date with Nico's latest implementation and also cover
> >> >>  additions to the format that everybody seems to agree on:
> >> >>
> >> >>   - new types for canonical trees and commits
> >> >>   - sha-1 table covering missing objects in thin packs
> >> >
> >> > Great!  I've merged this into my branch with the following amendment:
> >>
> >> I'd like to propose another change in the format to basically limit
> >> the use of sha1ref format "\0<SHA-1>" to tree entries only. All forms
> >> of deltas must have non-zero sha1 index (i.e. reference to SHA-1
> >> table). It will simplify handling code, and I think it makes sense too
> >
> > Why?
> >
> > This is still some artificial limitation and I tend not to like them.
> >
> > "Simplifying handling code" is not a good enough reason on its own,
> > especially if it reduce flexibility for possible future layout
> > optimizations.
> >
> > In what way is the code more difficult?
> 
> That'll be two ways of linking to another in-pack object. The linked
> object must be in the pack anyway, its sha-1 should be present in the
> sha-1 table. "\0<sha1>" is a longer encoding for nothing. If the
> linked sha1 is not in the pack (similar to the old ref-delta), it
> makes the pack depend on another one, which is what we've been
> avoding.

OK.  Let's say that if the needed SHA1 exists in the table then the 
sha1ref should use an index into that table.  That's a recommendation we 
can make, just like we suggest to have the ident/path tables sorted by 
usage frequency.

But even if index 0 with a SHA1 is used inline for an object in the same 
pack, it is still theoretically a valid pack even if somewhat wasteful.  
Maybe some tools in the future will benefit from this flexibility to 
simplify their implementation.

You mention that only tree objects should use this encoding, but commits 
should be allowed to use it as well.  Suppose that a commit reverts some 
changes to a pre-existing tree state.  That tree might already be in 
another pack and we shouldn't have to copy it into the current pack.  
Same goes for commit parents.  Etc.

> In your code you reject sha1ref starting with zero too
> (sha1_file.c::get_base_delta and packv4-parse.c::decode_entries)

Yeah... because I wrote the minimum code to make it work with the 
current encoder.  But the decoder could be more lenient than that.  It's 
just a matter of adding a call to find_pack_entry_one() when the sha1ref 
index is 0.


Nicolas

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]