Re: [PATCH v3 1/2] pack-bitmap.c: remove unnecessary "open_pack_index()" calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 14, 2022 at 05:03:56PM -0500, Jeff King wrote:
> On Thu, Nov 10, 2022 at 03:10:11PM +0800, Teng Long wrote:
>
> > It's likely that we'll end up relying on objects in that pack later
> > in the process (in which case we're doing the work of opening the
> > pack index optimistically), but not guaranteed.
> >
> > For instance, consider a repository with a large number of small
> > packs, and one large pack with a bitmap. If we see that bitmap pack
> > last in our loop which calls open_pack_bitmap_1(), the current code
> > will have opened *all* pack index files in the repository. If the
> > request can be served out of the bitmapped pack alone, then the time
> > spent opening these idx files was wasted.S
>
> By the way, I wondered if it was possible to measure a slowdown in this
> case. It is, but you have to try pretty hard. Something like this:
>
>   # one bitmapped pack
>   git repack -adb
>
>   # and then a bunch of other packs
>   git rev-list HEAD |
>   head -10000 |
>   while read commit; do
>     echo $commit | git pack-objects .git/objects/pack/pack
>   done

OK, so with 10K packs, we see about a 1.6-fold improvement, which is
definitely substantial.

On a fresh clone of git.git, repeating your experiment with only 1K
packs (which is definitely a number of packs that GitHub sees in
under-maintained repositories), the runtime goes from 25.3ms -> 20.9ms
on my machine, or about a 1.2-fold improvement.

So definitely smaller, but even at 1/10th the number of packs from your
experiment, still noticeable.

>   - this probably isn't helping anybody much in the real world, as
>     evidenced by the contortions I had to go through to set up the
>     situation (and which would be made much better by repacking, which
>     would also speed up non-bitmap operations).

Per above, I'm not sure I totally agree ;-). 1K packs is definitely an
extreme amount of packs, but not out-of-this-world. It probably would
show up in carefully-picked graphs, but not in "overall git rev-list
time" or something as broad/noisy as that.

>   - it's worth doing anyway. Even if it only shaves off microseconds,
>     the existing call is just pointless.

Yes, definitely.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux