Hi Ævar
On 02/04/2022 11:49, Ævar Arnfjörð Bjarmason wrote:
Extend the release_revisions() function so that it frees the
"mailmap" in the "struct rev_info".
The log family of functions now calls the clear_mailmap() function
added in fa8afd18e5a (revisions API: provide and use a
release_revisions(), 2021-09-19), allowing us to whitelist some tests
with "TEST_PASSES_SANITIZE_LEAK=true".
Unfortunately having a pointer to a mailmap in "struct rev_info"
instead of an embedded member that we "own" gets a bit messy, as can be
seen in the change to builtin/commit.c.
When we free() this data we won't be able to tell apart a pointer to a
"mailmap" on the heap from one on the stack. As seen in
ea57bc0d41b (log: add --use-mailmap option, 2013-01-05) the "log"
family allocates it on the heap, but in the find_author_by_nickname()
code added in ea16794e430 (commit: search author pattern against
mailmap, 2013-08-23) we allocated it on the stack instead.
Ideally we'd simply change that member to a "struct string_list
mailmap" and never free() the "mailmap" itself, but that would be a
much larger change to the revisions API.
I agree it makes sense to leave that for now.
We have code that needs to hand an existing "mailmap" to a "struct
rev_info". While we could change all of that, let's not go there
now.
The complexity isn't in the ownership of the "mailmap" per se, but
that various things assume a "rev_info.mailmap == NULL" means "doesn't
want mailmap". If we changed that to an init'd "struct string_list"
we'd need to carefully refactor things to change those assumptions.
Let's instead always free() it, and simply declare that if you add
such a "mailmap" it must be allocated on the heap. Any modern libc
will correctly panic if we free() a stack variable, so this should be
safe going forward.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
---
builtin/commit.c | 5 ++---
revision.c | 9 +++++++++
t/t0056-git-C.sh | 1 +
t/t3302-notes-index-expensive.sh | 1 +
t/t4055-diff-context.sh | 1 +
t/t4066-diff-emit-delay.sh | 1 +
t/t7008-filter-branch-null-sha1.sh | 1 +
7 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index c7eda9bbb72..cd6cebcf8c8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1100,7 +1100,6 @@ static const char *find_author_by_nickname(const char *name)
struct rev_info revs;
struct commit *commit;
struct strbuf buf = STRBUF_INIT;
- struct string_list mailmap = STRING_LIST_INIT_NODUP;
const char *av[20];
int ac = 0;
@@ -1111,7 +1110,8 @@ static const char *find_author_by_nickname(const char *name)
av[++ac] = buf.buf;
av[++ac] = NULL;
setup_revisions(ac, av, &revs, NULL);
- revs.mailmap = &mailmap;
+ revs.mailmap = xmalloc(sizeof(struct string_list));
+ string_list_init_nodup(revs.mailmap);
This allocate-then-initialize pattern also appears in one of the
previous patches. Is it worth adding helpers that allocate and
initialize a struct string_list? Maybe string_list_new_nodup() and
string_list_new_dup().
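To make the suggestion concrete, here is a rough sketch of what such
helpers might look like. The names string_list_new_nodup() and
string_list_new_dup() are only the proposal above and are not existing
API. The struct and the xmalloc_sketch() function are simplified
stand-ins so that the example compiles on its own; in git.git the real
helpers would presumably build on xmalloc() and the existing
string_list_init_*() functions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for git's "struct string_list" (assumption). */
struct string_list {
	char **items;
	size_t nr, alloc;
	unsigned int strdup_strings : 1;
};

/* Stand-in for git's xmalloc(): exit instead of returning NULL. */
static void *xmalloc_sketch(size_t size)
{
	void *p = malloc(size);
	if (!p) {
		fprintf(stderr, "out of memory\n");
		exit(128);
	}
	return p;
}

/* Allocate a list on the heap and initialize it in one step. */
static struct string_list *string_list_new(int strdup_strings)
{
	struct string_list *list = xmalloc_sketch(sizeof(*list));
	memset(list, 0, sizeof(*list));
	list->strdup_strings = strdup_strings ? 1 : 0;
	return list;
}

/* Hypothetical helpers, named as suggested in this review. */
struct string_list *string_list_new_nodup(void)
{
	return string_list_new(0);
}

struct string_list *string_list_new_dup(void)
{
	return string_list_new(1);
}
```

With something like that, the hunk above would shrink to a single
"revs.mailmap = string_list_new_nodup();" line.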
read_mailmap(revs.mailmap);
if (prepare_revision_walk(&revs))
@@ -1122,7 +1122,6 @@ static const char *find_author_by_nickname(const char *name)
ctx.date_mode.type = DATE_NORMAL;
strbuf_release(&buf);
format_commit_message(commit, "%aN <%aE>", &buf, &ctx);
- clear_mailmap(&mailmap);
release_revisions(&revs);
return strbuf_detach(&buf, NULL);
}
diff --git a/revision.c b/revision.c
index 553f7de8250..622f0faecc4 100644
--- a/revision.c
+++ b/revision.c
@@ -2926,10 +2926,19 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
return left;
}
+static void release_revisions_mailmap(struct string_list *mailmap)
+{
+ if (!mailmap)
+ return;
+ clear_mailmap(mailmap);
+ free(mailmap);
+}
It's not a big issue, but if there are no other users of this then it
could just go inside release_revisions(). My impression is that this
series is building up a collection of very small functions whose only
caller is release_revisions().
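Roughly what I have in mind, as a standalone sketch rather than a
patch. The structs and clear_mailmap_sketch() are cut-down stand-ins
for the real "struct rev_info", "struct string_list" and
clear_mailmap(), and setting the pointer to NULL afterwards is my
addition so that a second call is harmless.

```c
#include <stdlib.h>

/* Stand-ins with only the members this sketch needs (assumption). */
struct string_list {
	char **items;
	size_t nr, alloc;
};

struct rev_info {
	struct string_list *mailmap;
};

/* Stand-in for clear_mailmap(): drop the list's contents. */
static void clear_mailmap_sketch(struct string_list *map)
{
	free(map->items);
	map->items = NULL;
	map->nr = map->alloc = 0;
}

void release_revisions_sketch(struct rev_info *revs)
{
	/* ... other members would be released here ... */

	/* The mailmap must be heap-allocated, as the commit message says. */
	if (revs->mailmap) {
		clear_mailmap_sketch(revs->mailmap);
		free(revs->mailmap);
		revs->mailmap = NULL; /* not in the patch; avoids a dangling pointer */
	}
}
```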
Best Wishes
Phillip