On Fri, Sep 06, 2019 at 01:10:46AM +0200, Stephan Beyer wrote: > I took a quick glance at yours. I also noticed the issue you address in > [PATCH 2/6], but I was unsure if this is the way to go (I'm only > occasionally reading on this list). I would prefer your patch series, > with maybe one exception... Yeah, we've been slowly removing these old "unsigned char *" references (see a7db4c193d98f for the latest round touching this same code -- I'm actually surprised I missed this one back then, as that's when packlist_find() got converted). > The thing is: I had *exactly* the same commit like your [PATCH 6/6] > (except for the commit message and for the number), but I dropped it. > Why? Because I had the feeling (no particular instance though) that the > second locate_object_entry_hash() for each insertion *can* indeed take > "too much" time. Also, I was wondering, if the "found = 1" case should > be catched as a BUG("should not happen") or something. > > I don't care much, though. The performance impact should probably be > checked carefully. I did measure it, like this: # Do the traversal separately. This would make any difference in the # pack-objects hash code stand out _more_, plus it makes it cheaper to # run multiple trials. git rev-list --objects --all >input # Make sure stderr is redirected to avoid progress, which again would # amplify any differences. git pack-objects --stdout --delta-base-offset <input >/dev/null 2>&1 Running on linux.git, I got: [before] Benchmark #1: git pack-objects --stdout --delta-base-offset <input >/dev/null 2>&1 Time (mean ± σ): 26.225 s ± 0.233 s [User: 24.089 s, System: 4.867 s] Range (min … max): 25.915 s … 26.555 s 10 runs [after] Benchmark #1: git pack-objects --stdout --delta-base-offset <input >/dev/null 2>&1 Time (mean ± σ): 26.129 s ± 0.170 s [User: 24.003 s, System: 4.958 s] Range (min … max): 25.974 s … 26.570 s 10 runs So actually faster after, though not statistically significant. ;) The BUG() on "found==1" might be worth doing. -Peff