small question about the repack algorithm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  I've trying to see if that optimization was used but I was somehow
unable to find if it was the case, as the code is a bit tough :)

  I was wondering if the repacking window was using triangle inequality
to discard trying some costly deltas (I assume that what costs the most
in the repacking is computing the delta). I mean, if you consider the
"size" of a delta, I'm almost sure that it's very near a distance.

  So assuming that we know the delta sizes between any pair of reference
objects in the window, well, if an object we want to delta against the
window Od are near one reference O1 enough, for each Oi in the window
that holds: len(δ(O1, Oi)) > 2 * len(δ(Od, O1)), then it's not worth
investigating.

  I've seen quite many heuristics tried to avoid computing a delta at
the beginning of builtin-pack-objects.c:try_delta, but I don't seem to
find this one used, whereas it would even encompass most of the
heuristics here. I mean the object-size based heuristics are (if I'm
correct) a special case of the previous, e.g. I think one of the
heuristics is to not trying a delta if:
  | len(Oi) - len(Od) | > 2 * len(δ(Od, O1))

But it's just a subcase of the former because
  len(δ(Od, Oi)) >= | len(Oi) - len(Od) |


  Of course it needs us to know how objects in the same window diff
against one each other, of have a good estimation of it (having not too
bad lower estimates of the real deltas is an option here, as it just
makes the test a bit less discriminant, but can allow us to have a
trade-off between computing each crossed deltas in the window and
dropping tried deltas early). Note that we don't need to store the
actual deltas, just their length (or a more complicated score if we want
-- probably -- take the chain depth into account, we just need that
score to look like a distance enough).


  If it's already done, then don't mind me, though I'd like to be
pointed to where so that I can grok that code a bit more :)
-- 
·O·  Pierre Habouzit
··O                                                madcoder@xxxxxxxxxx
OOO                                                http://www.madism.org

Attachment: pgpavtVKZDHiw.pgp
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux