Re: [PATCH] use delta index data when finding best delta matches

Nicolas Pitre <nico@xxxxxxx> · Thu, 27 Apr 2006 21:56:00 -0400 (EDT)

On Thu, 27 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@xxxxxxx> writes:
> 
> > This patch allows for computing the delta index for each base object 
> > only once and reuse it when trying to find the best delta match.
> >
> > This should set the mark and pave the way for possibly better delta 
> > generator algorithms.
> >
> > Signed-off-by: Nicolas Pitre <nico@xxxxxxx>
> 
> My understanding is that theoretically this should not make any
> difference to the result, and should run faster when the memory
> pressure does not cause the machine to thrash.  However,....
> 
> I am seeing some differences.  Even with the smallish "git.git"
> repository, packing is slightly slower, and the end result is
> smaller.

Well, I changed some heuristics a bit.

> Not that I am complaining that it produces better results with a
> small performance penalty.  I am curious because I do not
> understand where the differences are coming from, and I was
> reluctant to merge it in "next" until I understand what is going
> on.
> 
> But I think I know where the differences come from:
> 
> -	sizediff = oldsize > size ? oldsize - size : size - oldsize;
> +	sizediff = src_size < size ? size - src_size : 0;

Right.  The idea is that when the delta source index has to be computed 
each time, if the target buffer is really small then we spend more time 
computing that index than anything else.

But when the delta index is computed only once and is already available 
anyway, we don't lose much by attempting a delta with a small target 
buffer: the index computation cost is non-existent at that point, and the 
actual delta generation will be quick if the target buffer is small.
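
To make the difference between the two expressions concrete, here is a
minimal standalone sketch in C.  Only the two expressions themselves come
from the quoted diff; the function names sizediff_old() and sizediff_new()
are made up for illustration, and how the result is then compared against
a limit is deliberately left out.

```c
/*
 * Illustration only: the two sizediff expressions from the quoted diff,
 * wrapped in hypothetical helper functions so they can be compared.
 */

/* Old form: symmetric, penalises any size gap in either direction. */
static unsigned long sizediff_old(unsigned long oldsize, unsigned long size)
{
	return oldsize > size ? oldsize - size : size - oldsize;
}

/*
 * New form: asymmetric, only a target larger than the source counts.
 * A target smaller than the source yields 0, so it is never rejected
 * on size difference alone.
 */
static unsigned long sizediff_new(unsigned long src_size, unsigned long size)
{
	return src_size < size ? size - src_size : 0;
}
```

With a 10000-byte source and a 200-byte target, the old form gives 9800
while the new form gives 0, so the small target is no longer penalised.
With a 12000-byte target both forms give 2000.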

> There is another "omit smaller than 50" difference but that
> should not trigger -- we do not have files that small.

Right.  And if such small files show up they won't waste window space.

Nicolas