Re: [PATCH 2/2] ls-refs.c: traverse longest common ref prefix

On Tue, Jan 19, 2021 at 7:19 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> ls-refs performs a single walk over the whole ref namespace and sends
> the refs that match one of the given ref prefixes down to the user.
>
> This can be expensive if there are many refs overall, but the portion of
> them covered by the given prefixes is small by comparison.
>
> To reduce the gap between the number of refs traversed and the number
> of refs sent, only traverse references which start with the longest
> common prefix of the given prefixes. This is very
> reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
> disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
> multi-patterned 'git for-each-ref' invocations.
>
> The only difference here is that we are operating on ref prefixes, which
> do not necessarily point to a single reference. That is just fine, since
> all we care about is finding the longest common prefix among prefixes
> which can be thought of as refspecs for our purposes here.
>
> Similarly, for_each_fullref_in_prefixes may return more results than the
> caller asked for (since the longest common prefix might match something
> that a longer prefix in the same set wouldn't match) but
> ls-refs.c:send_ref() discards such results.
>
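
To check my understanding of the two paragraphs above, here is a minimal
sketch of the idea. It is not the code from the patch or from
b31e2680c4: common_prefix_len() and ref_matches() are hypothetical
helpers I made up, and the second one only stands in for the filtering
that send_ref() does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: return the length of the
 * longest common prefix shared by all `nr` strings in `prefixes`.
 */
size_t common_prefix_len(const char **prefixes, size_t nr)
{
	size_t len, i;

	assert(nr > 0);
	len = strlen(prefixes[0]);
	for (i = 1; i < nr; i++) {
		size_t j = 0;

		/* Shrink `len` to the part that prefixes[i] also shares. */
		while (j < len && prefixes[i][j] &&
		       prefixes[i][j] == prefixes[0][j])
			j++;
		len = j;
	}
	return len;
}

/*
 * Hypothetical stand-in for the check in send_ref(): keep a ref only
 * if it starts with at least one of the requested prefixes.
 */
int ref_matches(const char *refname, const char **prefixes, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strncmp(refname, prefixes[i], strlen(prefixes[i])))
			return 1;
	return 0;
}
```

With the prefixes "refs/heads/" and "refs/tags/" the common prefix is
"refs/" (length 5), so a walk over it would also visit
"refs/notes/commits", and ref_matches() would then drop that ref again.
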
> The code introduced in b31e2680c4 already stops early (and returns a
> shorter prefix) when it encounters a metacharacter (as mentioned in
> that patch, there is some opportunity to improve this, but nobody has
> done it).
>
> There are two remaining small items in this patch:
>
>   - If no prefixes were provided, then implicitly add the empty string
>     (which will match all references).
>
>   - Since we are manually munging the prefixes, make sure that we
>     initialize the strvec ourselves (previously this wasn't necessary
>     since the first strvec_push would do so).
>
> Original-patch-by: Jacob Vosmaer <jacob@xxxxxxxxxx>
> Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> ---
>  ls-refs.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..eaaa36d0df 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -90,6 +90,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>         struct ls_refs_data data;
>
>         memset(&data, 0, sizeof(data));
> +       strvec_init(&data.prefixes);
>
>         git_config(ls_refs_config, NULL);
>
> @@ -109,7 +110,10 @@ int ls_refs(struct repository *r, struct strvec *keys,
>                 die(_("expected flush after ls-refs arguments"));
>
>         head_ref_namespaced(send_ref, &data);
> -       for_each_namespaced_ref(send_ref, &data);
> +       if (!data.prefixes.nr)
> +               strvec_push(&data.prefixes, "");

The old code, with for_each_namespaced_ref, would walk
"${NAMESPACE}refs/". The new code walks "${NAMESPACE}" because
we're pushing "" onto data.prefixes. So anything in the namespace that
does not start with "refs/" will now get yielded as well.

Does that matter?
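
To make the concern concrete, here is a small sketch. It is not git
code: count_walked() is a hypothetical helper, and the ref names and the
namespace "refs/namespaces/foo/" are made up for illustration. It only
models "which names begin with namespace + prefix".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: count the names that a walk
 * over `ns` + `prefix` would visit, i.e. the names that start with the
 * namespace and, after it, with the given prefix.
 */
size_t count_walked(const char **names, size_t nr,
		    const char *ns, const char *prefix)
{
	size_t ns_len, prefix_len, i, n = 0;

	assert(ns && prefix);
	ns_len = strlen(ns);
	prefix_len = strlen(prefix);

	for (i = 0; i < nr; i++) {
		/* Skip names outside the namespace. */
		if (strncmp(names[i], ns, ns_len))
			continue;
		/* Inside the namespace: compare what follows it. */
		if (strncmp(names[i] + ns_len, prefix, prefix_len))
			continue;
		n++;
	}
	return n;
}
```

Given the names "refs/namespaces/foo/HEAD",
"refs/namespaces/foo/refs/heads/main" and "refs/heads/main", the prefix
"refs/" visits one name while the prefix "" visits two, the extra one
being "refs/namespaces/foo/HEAD".
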

> +       for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
> +                                    send_ref, &data, 0);
>         packet_flush(1);
>         strvec_clear(&data.prefixes);
>         return 0;
> --
> 2.30.0.138.g6d7191ea01


