Re: [PATCH 1/1] builtin/pack-objects.c: avoid iterating all refs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 19 2021, Jacob Vosmaer wrote:

> In git-pack-objects, we iterate over all the tags if the --include-tag
> option is passed on the command line. For some reason this uses
> for_each_ref which is expensive if the repo has many refs. We should
> use a prefix iterator instead.
>
> Signed-off-by: Jacob Vosmaer <jacob@xxxxxxxxxx>
> ---
>  builtin/pack-objects.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 2a00358f34..2b32bc93bd 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -3740,7 +3740,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>  	}
>  	cleanup_preferred_base();
>  	if (include_tag && nr_result)
> -		for_each_ref(add_ref_tag, NULL);
> +		for_each_fullref_in("refs/tags/", add_ref_tag, NULL, 0);
>  	stop_progress(&progress_state);
>  	trace2_region_leave("pack-objects", "enumerate-objects",
>  			    the_repository);

Not really a new problem, but it would be nice to also have a test in
t5305-include-tag.sh to check what happens with "tags" outside the
refs/tags/ namespace. I.e. if this & the existing prefix in
add_ref_tag() were dropped.

To elaborate on other things that aren't really your problem & Taylor's
E-Mail downthread we originally added this because:

    If new annotated tags have been introduced then we can also include
    them in the packfile, saving the client from needing to request them
    through a second connection.

I've barked up this whole tag fetch tree before 97716d217c (fetch: add a
--prune-tags option and fetch.pruneTags config, 2018-02-09) but really
not this specific area.

I wonder if longer term simply moving this to be work the client does
wouldn't make more sense. I.e. if we just delete this for_each_ref()
loop.
    
We're just saving a client from saying they "want" a tag. I.e. with the
whole thing removed we need this to make t5503-tagfollow.sh pass (see
[1] at the end for the dialog, the tag is 3253df4d...):

    diff --git a/t/t5503-tagfollow.sh b/t/t5503-tagfollow.sh
    index 6041a4dd32..1ddc430aef 100755
    --- a/t/t5503-tagfollow.sh
    +++ b/t/t5503-tagfollow.sh
    @@ -134,6 +134,7 @@ test_expect_success 'fetch B, S (commit and tag : 1 connection)' '
     test_expect_success 'setup expect' '
     cat - <<EOF >expect
     want $B
    +want $T
     want $S
     EOF
     '
    @@ -146,6 +147,7 @@ test_expect_success 'new clone fetch master and tags' '
                    cd clone2 &&
                    git init &&
                    git remote add origin .. &&
    +               git config --add remote.origin.fetch "+refs/tags/*:refs/tags/*" &&
                    GIT_TRACE_PACKET=$UPATH git fetch &&
                    test $B = $(git rev-parse --verify origin/master) &&
                    test $S = $(git rev-parse --verify tag2) &&

We're also saving the client the work of having to go through
refs/tags/* and figure out whether there are tags there that aren't on
our main history.

I think that since f0a24aa56e (git-pack-objects: Automatically pack
annotated tags if object was packed, 2008-03-03) was written it's become
clear that in the wild that's almost nobody's use-case. I.e. people with
a set of refs/heads/* branches and a refs/tags/* set that doesn't refer
to those branches.

Or if they do, I think it's such an obscure use-case that if we were
designing this today we could pass that work of figuring out if there's
such tags in refs/tags/* off to the client.

It seems we might just be able to delete this code on the server-side,
per protocol-capabilities.txt:

    Clients MUST be prepared for the case where a server has ignored
    include-tag and has not actually sent tags in the pack.  In such
    cases the client SHOULD issue a subsequent fetch to acquire the tags
    that include-tag would have otherwise given the client.

I.e. in the case where the server isn't playing along and I haven't set
"+refs/tags/*:refs/tags/*". But as the test shows we don't do that
following ourselves unless refs/tags/* is in the refspec (and then it's
not really "following", we're just getting all the tags).

1. From t5503-tagfollow.sh:
    $ grep -C5 3253df4d1cf6fb138b52b1938473bcfec1483223 UPLOAD_LOG 
    packet:  upload-pack< ref-prefix refs/tags/
    packet:  upload-pack< ref-prefix refs/tags/
    packet:  upload-pack< 0000
    packet:  upload-pack> 8e10cf4e007ad7e003463c30c34b1050b039db78 refs/heads/master
    packet:        fetch< 8e10cf4e007ad7e003463c30c34b1050b039db78 refs/heads/master
    packet:  upload-pack> 3253df4d1cf6fb138b52b1938473bcfec1483223 refs/tags/tag1 peeled:c06aaaf9c64f9ef1f209bcb6d7bdba68a5e27fab
    packet:        fetch< 3253df4d1cf6fb138b52b1938473bcfec1483223 refs/tags/tag1 peeled:c06aaaf9c64f9ef1f209bcb6d7bdba68a5e27fab
    packet:  upload-pack> ddfa4a33562179aca1ace2bcc662244a17d0b503 refs/tags/tag2 peeled:8e10cf4e007ad7e003463c30c34b1050b039db78
    packet:  upload-pack> 0000
    packet:        fetch< ddfa4a33562179aca1ace2bcc662244a17d0b503 refs/tags/tag2 peeled:8e10cf4e007ad7e003463c30c34b1050b039db78
    packet:        fetch< 0000
    packet:        fetch> command=fetch
    --
    packet:        fetch> 0001
    packet:        fetch> thin-pack
    packet:        fetch> include-tag
    packet:        fetch> ofs-delta
    packet:        fetch> want 8e10cf4e007ad7e003463c30c34b1050b039db78
    packet:        fetch> want 3253df4d1cf6fb138b52b1938473bcfec1483223
    packet:        fetch> want ddfa4a33562179aca1ace2bcc662244a17d0b503
    packet:        fetch> done
    packet:        fetch> 0000
    packet:  upload-pack< object-format=sha1
    packet:  upload-pack< command=fetch
    --
    packet:  upload-pack< 0001
    packet:  upload-pack< thin-pack
    packet:  upload-pack< include-tag
    packet:  upload-pack< ofs-delta
    packet:  upload-pack< want 8e10cf4e007ad7e003463c30c34b1050b039db78
    packet:  upload-pack< want 3253df4d1cf6fb138b52b1938473bcfec1483223
    packet:  upload-pack< want ddfa4a33562179aca1ace2bcc662244a17d0b503
    packet:  upload-pack< done
    packet:  upload-pack< 0000
    packet:  upload-pack> packfile
    packet:        fetch< packfile
    



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux