On Monday, August 05, 2013 08:38:47 pm Ramkumar Ramachandra wrote:
> This is the rough explanation I wrote down after reading
> it:
>
> So, the problem is that my .git/objects/pack is polluted
> with little packs every time I fetch (or push, if you're
> the server), and this is problematic from the
> perspective of an overtly (naively) aggressive gc that
> hammers out all fragmentation.  So, on the first run,
> the little packfiles I have are all "consolidated" into
> big packfiles; you also write .keep files to say
> "don't gc these big packs we just generated".  In
> subsequent runs, the little packfiles from the fetch are
> absorbed into a pack that is immune to gc.  You're also
> using a size heuristic to consolidate similarly sized
> packfiles.  You also have a --ratio to tweak the ratio
> of sizes.
>
> From: Martin Fick<mfick@xxxxxxxxxxxxxx>
> See: https://gerrit-review.googlesource.com/#/c/35215/
> Thread:
> http://thread.gmane.org/gmane.comp.version-control.git/231555
> (Martin's emails are missing from the archive)
> ---

After analyzing today's data, I recognize that in some
circumstances the size estimate after consolidation can be off
by huge amounts.  The script naively just adds the current sizes
together.  This gives a very rough estimate of the new packfile
size, but sometimes it can be off by more than two orders of
magnitude. :(  While many new packfiles are tiny (only a few KB),
it seems like the larger new packfiles have a terrible tendency
to throw the estimate way off (I suspect they simply contain many
duplicate objects).  But despite this poor estimate, the script
still offers drastic improvements over plain git gc.

So, it has me wondering whether there is a more accurate way to
estimate the size of the new packfile without wasting a ton of
time.  If not, one approach which might be worth experimenting
with is to just assume that new packfiles have size 0!
Then just consolidate them with any other packfile which is ready
for consolidation or, if none are ready, with the smallest
packfile.  I would not be surprised to see this work better on
average than the current summation.

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora
Forum, hosted by The Linux Foundation
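
To make the two estimates discussed above concrete, here is a minimal Python sketch. It is not the script from the Gerrit change; the `Pack` record, its `is_new` flag, the function names, and the sizes in the example are all hypothetical and exist only for illustration. It contrasts the current summation with the "new packfiles have size 0" assumption, and shows one possible reading of the proposed target selection.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pack:
    name: str
    size: int     # on-disk size in bytes
    is_new: bool  # assumption: True for packs created by a fetch/push since the last run


def estimate_sum(packs: List[Pack]) -> int:
    """Current approach as described in the mail: add the on-disk sizes."""
    return sum(p.size for p in packs)


def estimate_new_as_zero(packs: List[Pack]) -> int:
    """Proposed alternative: count every new packfile as size 0."""
    return sum(0 if p.is_new else p.size for p in packs)


def pick_target(ready: List[Pack], kept: List[Pack]) -> Optional[Pack]:
    """Choose the pack that new packfiles get consolidated with.

    `ready` holds packs already selected for consolidation by the size
    heuristic; `kept` holds all existing consolidated packs.
    """
    if ready:
        return ready[0]  # "any other packfile which is ready"
    if kept:
        return min(kept, key=lambda p: p.size)  # otherwise the smallest packfile
    return None  # nothing to consolidate with


# Hypothetical numbers, chosen only to show how the estimates diverge.
kept = [Pack("big", 4_000_000_000, False), Pack("mid", 300_000_000, False)]
new = [Pack("fetch-1", 4_000, True), Pack("fetch-2", 250_000_000, True)]

target = pick_target([], kept)
candidate = [target] + new
print(target.name, estimate_sum(candidate), estimate_new_as_zero(candidate))
```

If the large new pack consists mostly of objects already present in the target, the summation overshoots badly while the size-0 estimate stays close to the real result; whether that holds on average is exactly the open question raised above.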