On Tue, Oct 08, 2024 at 04:54:23AM -0400, Jeff King wrote: > On Mon, Oct 07, 2024 at 11:41:16AM +0200, Patrick Steinhardt wrote: > > > > Hmm. I think I am probably splitting hairs, since a few xstrdup() calls > > > between friends is unlikely to matter here, but... ;-) > > > > > > It does seem wasteful if we can avoid it. We already allocated heap > > > space for the matches via the underlying strbuf, and we really do want > > > to hand ownership over if we can. > > > > > > Would doing something like this on top of your previous patch do the > > > trick? > > > > > > --- >8 --- > > > diff --git a/pseudo-merge.c b/pseudo-merge.c > > > index 28782a31c6..6b6605d3dc 100644 > > > --- a/pseudo-merge.c > > > +++ b/pseudo-merge.c > > > @@ -110,6 +110,7 @@ void pseudo_merge_group_release(struct pseudo_merge_group *group) > > > free(matches->stable); > > > free(matches->unstable); > > > free(matches); > > > + free((char*)e->key); > > > } > > > strmap_clear(&group->matches, 0); > > > --- 8< --- > > > > > > That introduces an ugly const-cast, but I think it's tolerable (and may > > > be moreso with a comment explaining why it's safe). > > > > Well, to me this is a tradeoff between complexity and performance. If > > the performance benefit outweighs the additional complexity that this > > shared ownership of keys brings along with it then I'm okay with the > > code being intimate with each others lifetime requirements. > > > > But as far as I can see the number of entries is determined by the group > > pattern, and I expect that in most cases it's going to be quite limited. > > So it does not feel like this should lead to all that many extra > > allocations, and thus I tend to prefer the simpler solution. > > I'd tend to agree with you that the allocations aren't a big deal here. > But I think we've run into this kind of strbuf-detaching thing before, > and there's another pattern that is efficient without getting too > intimate between modules. Like this (plus your change to set the > strmap's strdup_strings mode): > > diff --git a/pseudo-merge.c b/pseudo-merge.c > index 10ebd9a4e9..fb1164edfa 100644 > --- a/pseudo-merge.c > +++ b/pseudo-merge.c > @@ -210,6 +210,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname, > struct bitmap_writer *writer = _data; > struct object_id peeled; > struct commit *c; > + struct strbuf group_name = STRBUF_INIT; > uint32_t i; > int has_bitmap; > > @@ -227,10 +228,11 @@ static int find_pseudo_merge_group_for_ref(const char *refname, > for (i = 0; i < writer->pseudo_merge_groups.nr; i++) { > struct pseudo_merge_group *group; > struct pseudo_merge_matches *matches; > - struct strbuf group_name = STRBUF_INIT; > regmatch_t captures[16]; > size_t j; > > + strbuf_reset(&group_name); > + > group = writer->pseudo_merge_groups.items[i].util; > if (regexec(group->pattern, refname, ARRAY_SIZE(captures), > captures, 0)) > @@ -256,8 +258,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname, > matches = strmap_get(&group->matches, group_name.buf); > if (!matches) { > matches = xcalloc(1, sizeof(*matches)); > - strmap_put(&group->matches, strbuf_detach(&group_name, NULL), > - matches); > + strmap_put(&group->matches, group_name.buf, matches); > } > > if (c->date <= group->stable_threshold) { > @@ -270,9 +271,9 @@ static int find_pseudo_merge_group_for_ref(const char *refname, > matches->unstable[matches->unstable_nr++] = c; > } > > - strbuf_release(&group_name); > } > > + strbuf_release(&group_name); > return 0; > } > > > Now we skip the duplicate allocations because we are reusing the > group_name scratch buffer in the loop over and over. But wait, there's > more! We're actually _more_ efficient than the original which did one > allocation per entry, because: > > 1. We can allocate the correct number of bytes for each name, rather > than using the over-estimated buffer made by strbuf. > > 2. In strdup_strings mode, strmap is smart enough to use a FLEXPTR to > stick the name inside the struct. So we've actually reduced the > number of allocations per entry by 1. > > Now possibly it is not even worth doing this optimization, because this > is not really a hot path. But it doesn't violate any module boundaries, > and I think the "loop over a reusable strbuf" trick is a general idiom > for solving these kinds of allocation problems. So a good thing to keep > in our toolbox. Nice, thanks for digging! I see that Junio has already merged the topic to `next` though -- is this a mere "Let this cook before the next cycle starts" or will it stay in next? If the latter then I'll leave this as-is, otherwise I'll send a revised version to make this a bit more efficient. Thanks! Patrick