In git-pack-objects, we iterate over all the tags if the --include-tag option is passed on the command line. For some reason this uses for_each_ref which is expensive if the repo has many refs. We should use for_each_tag_ref instead. Because the add_ref_tag callback will now only visit tags we simplified it a bit. The motivation for this change is that we observed performance issues with a repository on gitlab.com that has 500,000 refs but only 2,000 tags. The fetch traffic on that repo is dominated by CI, and when we changed CI to fetch with 'git fetch --no-tags' we saw a dramatic change in the CPU profile of git-pack-objects. This lead us to this particular ref walk. More details in: https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_483546598 Signed-off-by: Jacob Vosmaer <jacob@xxxxxxxxxx> --- builtin/pack-objects.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 2a00358f34..ad52c91bdb 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2803,13 +2803,11 @@ static void add_tag_chain(const struct object_id *oid) } } -static int add_ref_tag(const char *path, const struct object_id *oid, int flag, void *cb_data) +static int add_ref_tag(const char *tag, const struct object_id *oid, int flag, void *cb_data) { struct object_id peeled; - if (starts_with(path, "refs/tags/") && /* is a tag? */ - !peel_ref(path, &peeled) && /* peelable? */ - obj_is_packed(&peeled)) /* object packed? */ + if (!peel_ref(tag, &peeled) && obj_is_packed(&peeled)) add_tag_chain(oid); return 0; } @@ -3740,7 +3738,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) } cleanup_preferred_base(); if (include_tag && nr_result) - for_each_ref(add_ref_tag, NULL); + for_each_tag_ref(add_ref_tag, NULL); stop_progress(&progress_state); trace2_region_leave("pack-objects", "enumerate-objects", the_repository); -- 2.30.0