On 3/5/2017 19:14, Linus Torvalds wrote:
On Sat, Mar 4, 2017 at 12:27 AM, Marius Storm-Olsen <mstormo@xxxxxxxxx> wrote:
I guess you could do the printout a bit earlier (on the "to_pack.objects[]" array - to_pack.nr_objects is the count there). That should show all of them. But the small objects shouldn't matter. But if you have a file like extern/win/FlammableV3/x64/lib/FlameProxyLibD.lib I would have assumed that it has a size that is > 50. Unless those "extern" things are placeholders?
No placeholders, the FlameProxyLibD.lib is a debug lib, and probably the largest in the whole repo (with a replace count > 5).
I do wonder if your dll data just simply is absolutely horrible for xdelta. We've also limited the delta finding a bit, simply because it had some O(m*n) behavior that gets very expensive on some patterns. Maybe your blobs trigger some of those cases.
Ok, but given that the SVN delta compression, which is forward-linear only, is ~45% better, perhaps that particular search could be done fairly cheaply? Although, I bet time(stamps) are out of the loop at that point, so it's not a factor anymore. Even if it were, I'm not sure it would solve anything, if there are other factors also limiting deltafication.
The diff-delta work all goes back to 2005 and 2006, so it's a long time ago. What I'd ask you to do is try to find if you could make a repository of just one of the bigger DLLs with its history, particularly if you can find some that you don't think is _that_ sensitive. Looking at it, for example, I see that you have that file extern/redhat-5/FlammableV3/x64/plugins/libFlameCUDA-3.0.703.so that seems to have changed several times, and is a largish blob. Could you try creating a repository with git fast-import that *only* contains that file (or pick another one), and see if that deltas well?
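For illustration only, a single-file import along those lines might be sketched as below. This is not from the thread: the helper name build_fast_import_stream, the placeholder committer identity, and the fixed timestamps are all assumptions, and the path is assumed to need no quoting. It emits one commit per version of the file, each carrying the blob inline.

```python
def build_fast_import_stream(path, versions, ref="refs/heads/master"):
    """Return a git fast-import stream with one commit per version of `path`.

    `versions` is a list of bytes objects, oldest first. The helper name,
    committer identity and timestamps are hypothetical, for illustration only.
    """
    chunks = []
    for i, blob in enumerate(versions, start=1):
        message = f"version {i} of {path}\n".encode()
        # Fixed, increasing timestamps keep the stream reproducible.
        when = 1488000000 + i
        chunks.append(f"commit {ref}\n".encode())
        chunks.append(f"committer Test <test@example.com> {when} +0000\n".encode())
        # 'data <n>' is followed by exactly n raw bytes.
        chunks.append(b"data %d\n" % len(message) + message)
        # Replace the file with the new contents, supplied inline.
        chunks.append(f"M 100644 inline {path}\n".encode())
        chunks.append(b"data %d\n" % len(blob) + blob + b"\n")
    return b"".join(chunks)
```

The resulting bytes could be piped into git fast-import inside a freshly initialized repository; running git repack -a -d -f followed by git count-objects -v afterwards should give a rough idea of how well that one file deltas.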
I'll filter-branch to extern/ only, however the whole FlammableV3 needs to go too, I'm afraid (extern for that project, but internal to $WORK).
I'll do some rewrites and see what comes up.
And if you find some case that doesn't xdelta well, and that you feel you could make available outside, we could have a test-case...
I'll try with this repo first; if not, I'll see if I can construct one.
Thanks!
-- 
.marius