Re: [PATCH 2/4] pack v4: add v4_size to struct delta_base_cache_entry

Nicolas Pitre <nico@xxxxxxxxxxx> · Sat, 14 Sep 2013 00:22:16 -0400 (EDT)

On Fri, 13 Sep 2013, Nicolas Pitre wrote:

> On Fri, 13 Sep 2013, Duy Nguyen wrote:
> 
> > On Fri, Sep 13, 2013 at 8:27 PM, Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
> > > However I can see that, as you say, the same base object is repeatedly
> > > referenced.  This means that we need to parse it over and over again
> > > just to find the right offset where the needed entries start.  We
> > > probably end up skipping over the first entries in a tree object
> > > multiple times.  And that would be true even when the core code learns
> > > to walk pv4 trees directly.
> > >
> > > So here's the beginning of a tree offset cache to mitigate this problem.
> > > It is incomplete as the cache function is not implemented, etc.  But
> > > that should give you the idea.
> > 
> > Thanks. I'll have a closer look and maybe complete your patch.
> 
> I've committed an almost final patch and pushed it out.  It is disabled 
> right now due to some bug I've not found yet.  But you can have a look.

Found the bug.

The cache is currently updated by the caller.  The caller may ask for a 
copy of 2 entries from a base object, but that base object may itself 
copy those objects from somewhere else in a larger chunk.

Let's consider this example:

tree A
------
0 (0) copy 2 entries from tree B starting at entry 0
1 (2) copy 1 entry from tree B starting at entry 3

tree B
------
0 (0) copy 6 entries from tree C starting at entry 0
1 (6) entry "foo.txt"
2 (7) entry "bar.txt"

Right now, the code calls decode_entries() to decode 2 entries from tree 
B but those entries are part of a copy from tree C.  When that call 
returns, the cache is updated as if tree B entry #2 would start at 
offset 1 but this is wrong because offset 0 in tree B covers 6 entries 
and therefore offset 1 is for entry #6.

So this needs a rethink.

I won't be able to work on this for a few days.

Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html