On Thu, Aug 8, 2019 at 12:39 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Phil Hord <phil.hord@xxxxxxxxx> writes: > > > From: Phil Hord <phil.hord@xxxxxxxxx> > > > > 'git tag -d' accepts one or more tag refs to delete, but each deletion > > is done by calling `delete_ref` on each argv. This is painfully slow > > when removing from packed refs. Use delete_refs instead so all the > > removals can be done inside a single transaction with a single write. > > > > I have a repo with 24,000 tags, most of which are not useful to any > > developers. Having this many refs slows down many operations that > > would otherwise be very fast. Removing these tags when they've been > > accidentally fetched again takes about 30 minutes using delete_ref. > > > > git tag -l feature/* | xargs git tag -d > > > > Removing the same tags using delete_refs takes less than 5 seconds. > > Makes sense. As mentioned elsewhere in the thread already, > a batched update-ref would open the packed-refs ony once because > everything is done in a single transaction, so presumably a pipeline > like this > > git tag -l feature/* | > sed -e 's|^|delete refs/tags/|' | > git update-ref --stdin > > may work well, and "git tag -d" that gets these refs on the command > line should be capable of doing the same. > > > -static int delete_tag(const char *name, const char *ref, > > - const struct object_id *oid, const void *cb_data) > > +struct tag_args { > > + char *oid_abbrev; > > + char *refname; > > +}; > > + > > +static int make_string_list(const char *name, const char *ref, > > + const struct object_id *oid, void *cb_data) > > Please think about a few more minutes before naming a function like > this, and make it a habit for your future patches. > > We can see that the callback is used to insert more strings into a > string list, but the type (i.e. string_list) used to represent the > set is not all that important. What is more important is why you > are building that set for, and saying what is in the set (as opposed > to saying that the container happens to be a string_list) would be a > good first step. > > I presume that you are enumerating the tags to be deleted, together > with the data necessary for you to report the deletion of the tags? Hm. collect_tags? collect_tags_to_delete? It's true I didn't put enought thought into that. I was experimenting a bit here and was surprised how little code I ended up needing. > > { > > - if (delete_ref(NULL, ref, oid, 0)) > > - return 1; > > - printf(_("Deleted tag '%s' (was %s)\n"), name, > > - find_unique_abbrev(oid, DEFAULT_ABBREV)); > > + struct string_list *ref_list = cb_data; > > + struct tag_args *info = xmalloc(sizeof(struct tag_args)); > > + > > + string_list_append(ref_list, ref); > > + > > + info->oid_abbrev = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV)); > > + info->refname = xstrdup(name); > > + ref_list->items[ref_list->nr - 1].util = info; > > return 0; > > } > > > > +static int delete_tags(const char **argv) > > +{ > > + int result; > > + struct string_list ref_list = STRING_LIST_INIT_DUP; > > + struct string_list_item *ref_list_item; > > + > > + result = for_each_tag_name(argv, make_string_list, (void *) &ref_list); > > + if (!result) > > + result = delete_refs(NULL, &ref_list, REF_NO_DEREF); > > + > > + for_each_string_list_item(ref_list_item, &ref_list) { > > + struct tag_args * info = ref_list_item->util; > > + if (!result) > > + printf(_("Deleted tag '%s' (was %s)\n"), info->refname, > > + info->oid_abbrev); > > + free(info->oid_abbrev); > > + free(info->refname); > > + free(info); > > It is not performance critical, but info->refname is computable from > ref_list_item->string, isn't it? Oh, I guess it is. It's a fixed offset into the string, after all. Thanks. I did look for a way to avoid the struct noise. Just not well. > I am just wondering if we can do > this without having to allocate the .util field for each of 20,000 > tags. We still need to remember oid (or oid_abbrev, but if I were > writing this, I'd record the full oid in .util and make the code > that prints call find_unique_abbrev() on it), so I guess we cannot > really leave .util NULL. My original patch did this (.util = oid). But then I needed a name. I'll go back to keeping the oid. Much cleaner. > > > + } > > + string_list_clear(&ref_list, 0); > > + return result; > > We used to return the returned value from for_each_tag_name() that > repeatedly called delete_tag(). > > Now we return value from delete_refs(). Are our caller(s) OK with > the values that may come back from that function? Can delete_refs() > return a value that is not appropriate to be returned from > cmd_tag(), for example a negative value? Yes it does. Will fix. > > > +} > > + > > static int verify_tag(const char *name, const char *ref, > > - const struct object_id *oid, const void *cb_data) > > + const struct object_id *oid, void *cb_data) > > { > > int flags; > > const struct ref_format *format = cb_data; > > @@ -511,7 +543,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix) > > if (filter.merge_commit) > > die(_("--merged and --no-merged options are only allowed in list mode")); > > if (cmdmode == 'd') > > - return for_each_tag_name(argv, delete_tag, NULL); > > + return delete_tags(argv); > > Thanks.