Re: Delta compression not so effective

On 3/5/2017 19:14, Linus Torvalds wrote:
On Sat, Mar 4, 2017 at 12:27 AM, Marius Storm-Olsen <mstormo@xxxxxxxxx> wrote:
I guess you could do the printout a bit earlier (on the
"to_pack.objects[]" array - to_pack.nr_objects is the count there).
That should show all of them. But the small objects shouldn't matter.

But if you have a file like

   extern/win/FlammableV3/x64/lib/FlameProxyLibD.lib

I would have assumed that it has a size that is > 50. Unless those
"extern" things are placeholders?

No placeholders. FlameProxyLibD.lib is a debug lib, and probably the largest in the whole repo (with a replace count > 5).


I do wonder if your dll data just simply is absolutely horrible for
xdelta. We've also limited the delta finding a bit, simply because it
had some O(m*n) behavior that gets very expensive on some patterns.
Maybe your blobs trigger some of those cases.

Ok, but given that the SVN delta compression, which is forward-linear only, is ~45% better, perhaps that particular search could be done fairly cheaply? Although, I bet time(stamps) are out of the loop at that point, so it's not a factor anymore. Even if it were, I'm not sure it would solve anything if there are other factors also limiting deltification.


The diff-delta work all goes back to 2005 and 2006, so it's a long time ago.

What I'd ask you to do is try to find if you could make a repository of
just one of the bigger DLLs with its history, particularly if you can
find some that you don't think is _that_ sensitive.

Looking at it, for example, I see that you have that file

   extern/redhat-5/FlammableV3/x64/plugins/libFlameCUDA-3.0.703.so

that seems to have changed several times, and is a largish blob. Could
you try creating a repository with git fast-import that *only*
contains that file (or pick another one), and see if that deltas
well?
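
A minimal sketch of how such a single-file repository could be generated, assuming the successive versions of the blob are already available as byte strings. The function name, the path "lib/example.so", the committer identity and the timestamps are all placeholders for illustration, not anything taken from the real repository. It emits a git fast-import stream with one commit per version, each carrying the file inline:

```python
def build_fast_import_stream(path, versions, ref="refs/heads/master",
                             ident="Example <example@example.invalid>",
                             start_time=1488672000):
    """Return a git fast-import stream (bytes): one commit per file version.

    path       -- path of the single file inside the new repository
    versions   -- iterable of bytes objects, oldest version first
    ident, start_time -- placeholder committer identity and base timestamp
    """
    out = []
    for i, content in enumerate(versions, start=1):
        msg = f"version {i} of {path}\n".encode()
        out.append(f"commit {ref}\n".encode())
        # Raw date format: seconds since the epoch plus a UTC offset.
        out.append(f"committer {ident} {start_time + i} +0000\n".encode())
        out.append(b"data %d\n" % len(msg) + msg)
        # No 'from' line: the first commit becomes a root commit, and later
        # commits on the same ref continue from the current branch tip.
        out.append(f"M 100644 inline {path}\n".encode())
        # Exact byte count, raw data, then the optional trailing newline.
        out.append(b"data %d\n" % len(content) + content + b"\n")
    return b"".join(out)


if __name__ == "__main__":
    import sys
    # Hypothetical stand-in data; real use would read the historical blobs.
    demo = [b"A" * 1000, b"A" * 990 + b"B" * 10]
    sys.stdout.buffer.write(build_fast_import_stream("lib/example.so", demo))
```

The output can be piped into "git fast-import" inside a freshly created empty repository. Running "git repack -a -d -f" followed by "git count-objects -v" afterwards gives a rough idea of how well the history packs.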

I'll filter-branch to extern/ only, however the whole FlammableV3 needs to go too, I'm afraid (extern for that project, but internal to $WORK).
I'll do some rewrites and see what comes up.

And if you find some case that doesn't xdelta well, and that you feel
you could make available outside, we could have a test-case...
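
As a rough aid for spotting blobs that do not delta well, the per-object listing printed by "git verify-pack -v" can be summarized. The sketch below assumes the documented line layout (object name, type, size, size in packfile, offset, and for deltified objects additionally the delta depth and base object name). The function name and the returned keys are invented for illustration.

```python
def summarize_verify_pack(text):
    """Summarize blob entries from the text output of `git verify-pack -v`.

    Returns a dict with the number of blobs, how many of them are stored
    as deltas, and the total bytes the blobs occupy inside the packfile.
    """
    stats = {"blobs": 0, "deltified": 0, "packed_bytes": 0}
    for line in text.splitlines():
        fields = line.split()
        # Object lines have at least five fields with the type in the second
        # one; trailing summary lines ("non delta: ...") fail this check.
        if len(fields) < 5 or fields[1] != "blob":
            continue
        stats["blobs"] += 1
        stats["packed_bytes"] += int(fields[3])  # size in packfile
        # Deltified entries carry two extra fields: depth and base object.
        if len(fields) >= 7:
            stats["deltified"] += 1
    return stats
```

A history in which few blobs are deltified, or in which the packed size stays close to the sum of the individual file sizes, would be a candidate for the kind of test case discussed here.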

I'll try with this repo first, if not, I'll see if I can construct one.

Thanks!


--
.marius


