Re: [PATCH v5 17/27] revisions API: have release_revisions() release "mailmap"


Hi Ævar

On 02/04/2022 11:49, Ævar Arnfjörð Bjarmason wrote:
Extend the release_revisions() function so that it frees the
"mailmap" in the "struct rev_info".

The log family of functions now calls the clear_mailmap() function
added in fa8afd18e5a (revisions API: provide and use a
release_revisions(), 2021-09-19), allowing us to whitelist some tests
with "TEST_PASSES_SANITIZE_LEAK=true".

Unfortunately, having a pointer to a mailmap in "struct rev_info"
instead of an embedded member that we "own" gets a bit messy, as can be
seen in the change to builtin/commit.c.

When we free() this data we won't be able to tell apart a pointer to a
"mailmap" on the heap from one on the stack. As seen in
ea57bc0d41b (log: add --use-mailmap option, 2013-01-05) the "log"
family allocates it on the heap, but in the find_author_by_nickname()
code added in ea16794e430 (commit: search author pattern against
mailmap, 2013-08-23) we allocated it on the stack instead.

Ideally we'd simply change that member to a "struct string_list
mailmap" and never free() the "mailmap" itself, but that would be a
much larger change to the revisions API.

I agree it makes sense to leave that for now.

We have code that needs to hand an existing "mailmap" to a "struct
rev_info". While we could change all of that, let's not go there
now.

The complexity isn't in the ownership of the "mailmap" per se, but
that various things assume a "rev_info.mailmap == NULL" means "doesn't
want mailmap". If we changed that to an init'd "struct string_list"
we'd need to carefully refactor things to change those assumptions.

Let's instead always free() it, and simply declare that if you add
such a "mailmap" it must be allocated on the heap. Any modern libc
will correctly panic if we free() a stack variable, so this should be
safe going forward.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
---
  builtin/commit.c                   | 5 ++---
  revision.c                         | 9 +++++++++
  t/t0056-git-C.sh                   | 1 +
  t/t3302-notes-index-expensive.sh   | 1 +
  t/t4055-diff-context.sh            | 1 +
  t/t4066-diff-emit-delay.sh         | 1 +
  t/t7008-filter-branch-null-sha1.sh | 1 +
  7 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index c7eda9bbb72..cd6cebcf8c8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1100,7 +1100,6 @@ static const char *find_author_by_nickname(const char *name)
  	struct rev_info revs;
  	struct commit *commit;
  	struct strbuf buf = STRBUF_INIT;
-	struct string_list mailmap = STRING_LIST_INIT_NODUP;
  	const char *av[20];
  	int ac = 0;
@@ -1111,7 +1110,8 @@ static const char *find_author_by_nickname(const char *name)
  	av[++ac] = buf.buf;
  	av[++ac] = NULL;
  	setup_revisions(ac, av, &revs, NULL);
-	revs.mailmap = &mailmap;
+	revs.mailmap = xmalloc(sizeof(struct string_list));
+	string_list_init_nodup(revs.mailmap);

This allocate-then-initialize pattern also appears in one of the previous patches. Is it worth adding helpers that allocate and initialize a struct string_list in one step? Maybe string_list_new_nodup() and string_list_new_dup().
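
To make the suggestion concrete, here is a rough sketch of what such helpers could look like. The names string_list_new_nodup() and string_list_new_dup() are only the proposal above, not existing API, and the struct below is a simplified stand-in rather than Git's real "struct string_list". In Git itself the allocation would presumably go through xmalloc() or xcalloc() rather than the abort() used here.

```c
#include <stdlib.h>

/* Simplified stand-in for Git's "struct string_list" (an assumption). */
struct string_list_item {
	char *string;
	void *util;
};

struct string_list {
	struct string_list_item *items;
	size_t nr, alloc;
	unsigned int strdup_strings : 1;
};

/* Hypothetical shared helper: allocate a zeroed list and set the dup flag. */
static struct string_list *string_list_new(int strdup_strings)
{
	struct string_list *list = calloc(1, sizeof(*list));

	if (!list)
		abort(); /* stand-in for the die() that xcalloc() would do */
	list->strdup_strings = !!strdup_strings;
	return list;
}

/* Hypothetical: heap-allocated list that does not copy its strings. */
static struct string_list *string_list_new_nodup(void)
{
	return string_list_new(0);
}

/* Hypothetical: heap-allocated list that strdup()s its strings. */
static struct string_list *string_list_new_dup(void)
{
	return string_list_new(1);
}
```

With a helper like that the hunk above would shrink to a single "revs.mailmap = string_list_new_nodup();" line.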

  	read_mailmap(revs.mailmap);
  	if (prepare_revision_walk(&revs))
@@ -1122,7 +1122,6 @@ static const char *find_author_by_nickname(const char *name)
  		ctx.date_mode.type = DATE_NORMAL;
  		strbuf_release(&buf);
  		format_commit_message(commit, "%aN <%aE>", &buf, &ctx);
-		clear_mailmap(&mailmap);
  		release_revisions(&revs);
  		return strbuf_detach(&buf, NULL);
  	}
diff --git a/revision.c b/revision.c
index 553f7de8250..622f0faecc4 100644
--- a/revision.c
+++ b/revision.c
@@ -2926,10 +2926,19 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
  	return left;
  }
+static void release_revisions_mailmap(struct string_list *mailmap)
+{
+	if (!mailmap)
+		return;
+	clear_mailmap(mailmap);
+	free(mailmap);
+}

It's not a big issue, but if there are no other users of this then it could just go inside release_revisions(). My impression is that this series builds up a collection of very small functions whose only caller is release_revisions().
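
As a rough illustration of the inlined version, under the assumption that nothing else needs to call the mailmap release: the types and the clear_mailmap() body below are simplified stand-ins so the snippet compiles on its own, not the real definitions from revision.h and mailmap.c. Resetting the pointer to NULL is my addition and is not in the patch.

```c
#include <stdlib.h>

/* Simplified stand-ins (assumptions), not Git's real definitions. */
struct string_list {
	char **items;
	size_t nr;
};

struct rev_info {
	struct string_list *mailmap;
	/* all other members omitted */
};

/* Stand-in for clear_mailmap(): frees the contents, not the list itself. */
static void clear_mailmap(struct string_list *map)
{
	for (size_t i = 0; i < map->nr; i++)
		free(map->items[i]);
	free(map->items);
	map->items = NULL;
	map->nr = 0;
}

/* Sketch: the mailmap cleanup folded directly into release_revisions(). */
static void release_revisions(struct rev_info *revs)
{
	if (revs->mailmap) {
		clear_mailmap(revs->mailmap);
		free(revs->mailmap); /* must have been heap-allocated */
		revs->mailmap = NULL; /* not in the patch; makes a second call harmless */
	}
	/* ...release of the other rev_info members would follow here... */
}
```

The NULL check keeps the existing "rev_info.mailmap == NULL means no mailmap" convention intact.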

Best Wishes

Phillip


