On Fri, Jan 28, 2011 at 17:32, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>
> I fully implemented the reuse of a cached pack behind a thin pack idea
> I was trying to describe in this thread.  It saved 1m7s off the JGit
> running time, but increased the data transfer by 25 MiB.  I didn't
> expect this much of an increase, I honestly expected the thin pack
> portion to be well, thinner.

JGit's thin pack creation is crap.  For example, this is the same fetch:

  $ git fetch ../tmp_linux26
  remote: Counting objects: 61521, done.
  remote: Compressing objects: 100% (12096/12096), done.
  remote: Total 50275 (delta 42578), reused 45220 (delta 37524)
  Receiving objects: 100% (50275/50275), 11.13 MiB | 7.29 MiB/s, done.
  Resolving deltas: 100% (42578/42578), completed with 4968 local objects.

  $ git fetch git://localhost/tmp_linux26
  remote: Counting objects: 144190, done
  remote: Finding sources: 100% (50275/50275)
  remote: Compressing objects: 100% (106568/106568)
  remote: Compressing objects: 100% (12750/12750)
  Receiving objects: 100% (50275/50275), 24.66 MiB | 10.93 MiB/s, done.
  Resolving deltas: 100% (40345/40345), completed with 2218 local objects.

JGit produced an extra 13.53 MiB for this pack, because it missed
about 2,233 delta opportunities.

It turns out we are too aggressive at pushing objects from the edges
into the delta windows.  JGit pushes *everything* in the edge commits,
rather than only the paths that are actually used by the objects we
need to send.  This floods the delta search window with garbage, and
makes it less likely that an object to be sent will find a relevant
delta base in the search window.

-- 
Shawn.
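
To make the edge-flooding problem concrete, here is a rough sketch in Python.  It is not JGit code.  The Obj record, the window_candidates() helper and the sample paths are all made up for illustration.  It only shows the difference between pushing every edge object into the delta search and pushing only the edge objects whose path matches something we actually need to send.

```python
from collections import namedtuple

# Hypothetical record, not a JGit type: an object id, the path it was
# found at, and whether it is an "edge" object (one the client already
# has, so it can serve as a delta base but is never sent).
Obj = namedtuple("Obj", ["oid", "path", "edge"])


def window_candidates(objects, filter_edges):
    """Return the objects that would be handed to the delta search.

    With filter_edges=False every edge object is included (the
    flooding behavior described above).  With filter_edges=True only
    edge objects sharing a path with an object to be sent are kept.
    """
    to_send = [o for o in objects if not o.edge]
    edges = [o for o in objects if o.edge]
    if filter_edges:
        wanted_paths = {o.path for o in to_send}
        edges = [o for o in edges if o.path in wanted_paths]
    return to_send + edges


# Illustrative data only: two objects to send, four edge objects.
objects = [
    Obj("a1", "Makefile", False),
    Obj("b1", "kernel/sched.c", False),
    Obj("a0", "Makefile", True),
    Obj("b0", "kernel/sched.c", True),
    Obj("c0", "drivers/net/foo.c", True),
    Obj("d0", "Documentation/bar.txt", True),
]

flooded = window_candidates(objects, filter_edges=False)
filtered = window_candidates(objects, filter_edges=True)

print(len(flooded), "candidates when every edge object is pushed")
print(len(filtered), "candidates when edges are filtered by path")
```

In the filtered case the two unrelated edge objects never enter the search, so they cannot crowd a same-path base out of a fixed-size window.  A real packer orders candidates by more than path, so treat this only as a picture of the filtering step.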