On Wed, Jul 18, 2018 at 10:41 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote: > On Thu, Jul 19, 2018 at 12:51 AM Elijah Newren <newren@xxxxxxxxx> wrote: >> >> I had a user report some poor behavior of 'git gc --aggressive' on a >> certain repo (which I sadly cannot share). Turns out that on this >> repo, this operation takes about 60% longer and produces a pack >> roughly twice the expected size. > > The intention was to make life better for weaker machines but > definitely should not slow down beefier ones, so yes this is > definitely a regression. Not sure if it matters, but the original discovery was on a laptop. Probably 4 cores and 16 GB RAM. I duplicated on my workstation (8 cores, 24 GB RAM). However, since the original patch was memory related and I noticed the repacking using all available memory, I repeated the testcases while measuring memory but did it on a machine that wouldn't be memory limited. > Is it possible to share "verify-pack -v <pack file>" output of the > pack produced by 2.17.0 and 2.18.0? The only sensitive info there is > sha-1, which you can replace with just "SHA-1" if you want. I'm more > interested in delta sizes and distribution. For the deltas, assuming the size-in-pack field (4th column) is the relevant one Number of objects in repo (lines of verify-pack output): 4460755 Number of deltas: 2831384 Number of deltas greater than 1MB: 72 Min: 14 Median: 47 Mean: 586 99.999 percentile: 11366280.6 (10.8 MB) Max: 141664210 (135.1 MB) If the size field (3rd column) is the relevant one, then the numbers change slightly to: Number of deltas greater than 1MB: 101 Min: 4 Median: 33 Mean: 660 99.999 percentile: 12245551.7 (11.7 MB) Max: 144658342 (138.0 MB) I checked out the blob which had the biggest delta, as well as the blob it was a delta against. One was a 280 MB file, the other a 278 MB file.