On Mon, Nov 14, 2022 at 05:03:56PM -0500, Jeff King wrote: > On Thu, Nov 10, 2022 at 03:10:11PM +0800, Teng Long wrote: > > > It's likely that we'll end up relying on objects in that pack later > > in the process (in which case we're doing the work of opening the > > pack index optimistically), but not guaranteed. > > > > For instance, consider a repository with a large number of small > > packs, and one large pack with a bitmap. If we see that bitmap pack > > last in our loop which calls open_pack_bitmap_1(), the current code > > will have opened *all* pack index files in the repository. If the > > request can be served out of the bitmapped pack alone, then the time > > spent opening these idx files was wasted.S > > By the way, I wondered if it was possible to measure a slowdown in this > case. It is, but you have to try pretty hard. Something like this: > > # one bitmapped pack > git repack -adb > > # and then a bunch of other packs > git rev-list HEAD | > head -10000 | > while read commit; do > echo $commit | git pack-objects .git/objects/pack/pack > done OK, so with 10K packs, we see about a 1.6-fold improvement, which is definitely substantial. On a fresh clone of git.git, repeating your experiment with only 1K packs (which is definitely a number of packs that GitHub sees in under-maintained repositories), the runtime goes from 25.3ms -> 20.9ms on my machine, or about a 1.2-fold improvement. So definitely smaller, but even at 1/10th the number of packs from your experiment, still noticeable. > - this probably isn't helping anybody much in the real world, as > evidenced by the contortions I had to go through to set up the > situation (and which would be made much better by repacking, which > would also speed up non-bitmap operations). Per above, I'm not sure I totally agree ;-). 1K packs is definitely an extreme amount of packs, but not out-of-this-world. It probably would show up in carefully-picked graphs, but not in "overall git rev-list time" or something as broad/noisy as that. > - it's worth doing anyway. Even if it only shaves off microseconds, > the existing call is just pointless. Yes, definitely. Thanks, Taylor