Re: [PATCH 0/4] Honor core.deltaBaseCacheLimit during index-pack

On Mon, 14 Jul 2008, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> > Hi,
> > 
> > On Mon, 14 Jul 2008, Andreas Ericsson wrote:
> > 
> > > Sorry for being clueless here, but why do the older versions need
> > > to be kept in memory anyway? Aren't we applying the delta each time
> > > we find one, repeatedly creating a new base object in memory for
> > > each delta? If we aren't doing that, why do we need more than just
> > > a small amount of memory for keeping the delta?
> > 
> > Think of a delta chain of 49 delta objects, 1 base object.  Now reconstruct
> > all of the objects.
> > 
> > If you do it one by one, not storing the intermediate results, you end up
> > applying 1 delta for the first, 2 for the second (first the first, then the
> > second), and so on, in total 1 + 2 + 3 + ... + 49 = 49 * 50 / 2 = 1225
> > times.
> > 
> > Compare that to the 49 times when reusing the intermediate results.
> > 
> 
> That's only true if you discard the result of applying D1 to DB, though.
> What I'm wondering is: why isn't it done like this (pseudo-code)?
> 
> while (delta = find_earlier_delta(sha1)) {
> 	if (is_base_object(delta)) {
> 		base_object = delta;
> 		break;
> 	}
> 	push_delta(delta, patch_stack);
> }
> 
> while ((delta = pop_delta(patch_stack)))
> 	apply_delta(base_object, delta);
> 
> 
> 
> where "apply_delta" replaces base_object->blob with the result of
> applying the delta, releasing the previously used memory?
> 
> That way, you never use more memory than the largest size the object
> ever reaches plus one delta at a time, and still apply no more than
> delta_chain_length - 1 deltas.

Those delta chains aren't simple linked lists.  They are trees of 
deltas, where one object might be the base for an unlimited number of 
deltas of depth 1, and in turn each of those deltas might constitute the 
base for an unlimited number of deltas of depth 2, and so on.

So what the code does is to find out which objects are not deltas but 
are the base for a delta.  Then, for each of them, all deltas having 
that object as their base are found and recursively resolved, so that 
each resolved delta is in turn considered a possible base for more 
deltas, etc.  In other words, those deltas are resolved by walking the 
delta tree in a "depth first" fashion.

If we discard previous delta bases, we have to recreate them each 
time a delta sibling is processed.  And if those delta bases are 
themselves deltas, then we get an explosion of delta results to 
re-compute.


Nicolas
