Re: [PATCH v5 17/27] revisions API: have release_revisions() release "mailmap"

On Sun, Apr 03 2022, Phillip Wood wrote:

> Hi Ævar
>
> On 02/04/2022 11:49, Ævar Arnfjörð Bjarmason wrote:
>> Extend the release_revisions() function so that it frees the
>> "mailmap" in the "struct rev_info".
>> The log family of functions now calls the clear_mailmap() function
>> added in fa8afd18e5a (revisions API: provide and use a
>> release_revisions(), 2021-09-19), allowing us to whitelist some tests
>> with "TEST_PASSES_SANITIZE_LEAK=true".
>> Unfortunately having a pointer to a mailmap in "struct rev_info"
>> instead of an embedded member that we "own" gets a bit messy, as can be
>> seen in the change to builtin/commit.c.
>> When we free() this data we won't be able to tell apart a pointer to a
>> "mailmap" on the heap from one on the stack. As seen in
>> ea57bc0d41b (log: add --use-mailmap option, 2013-01-05) the "log"
>> family allocates it on the heap, but in the find_author_by_nickname()
>> code added in ea16794e430 (commit: search author pattern against
>> mailmap, 2013-08-23) we allocated it on the stack instead.
>> Ideally we'd simply change that member to a "struct string_list
>> mailmap" and never free() the "mailmap" itself, but that would be a
>> much larger change to the revisions API.
>
> I agree it makes sense to leave that for now
>
>> We have code that needs to hand an existing "mailmap" to a "struct
>> rev_info". While we could change all of that, let's not go there
>> now.
>> The complexity isn't in the ownership of the "mailmap" per se, but
>> that various things assume a "rev_info.mailmap == NULL" means "doesn't
>> want mailmap". If we changed that to an init'd "struct string_list"
>> we'd need to carefully refactor things to change those assumptions.
>> Let's instead always free() it, and simply declare that if you add
>> such a "mailmap" it must be allocated on the heap. Any modern libc
>> will correctly panic if we free() a stack variable, so this should be
>> safe going forward.
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
>> ---
>>   builtin/commit.c                   | 5 ++---
>>   revision.c                         | 9 +++++++++
>>   t/t0056-git-C.sh                   | 1 +
>>   t/t3302-notes-index-expensive.sh   | 1 +
>>   t/t4055-diff-context.sh            | 1 +
>>   t/t4066-diff-emit-delay.sh         | 1 +
>>   t/t7008-filter-branch-null-sha1.sh | 1 +
>>   7 files changed, 16 insertions(+), 3 deletions(-)
>> diff --git a/builtin/commit.c b/builtin/commit.c
>> index c7eda9bbb72..cd6cebcf8c8 100644
>> --- a/builtin/commit.c
>> +++ b/builtin/commit.c
>> @@ -1100,7 +1100,6 @@ static const char *find_author_by_nickname(const char *name)
>>   	struct rev_info revs;
>>   	struct commit *commit;
>>   	struct strbuf buf = STRBUF_INIT;
>> -	struct string_list mailmap = STRING_LIST_INIT_NODUP;
>>   	const char *av[20];
>>   	int ac = 0;
>>
>> @@ -1111,7 +1110,8 @@ static const char *find_author_by_nickname(const char *name)
>>   	av[++ac] = buf.buf;
>>   	av[++ac] = NULL;
>>   	setup_revisions(ac, av, &revs, NULL);
>> -	revs.mailmap = &mailmap;
>> +	revs.mailmap = xmalloc(sizeof(struct string_list));
>> +	string_list_init_nodup(revs.mailmap);
>
> This is a common pattern in one of the previous patches, is it worth
> adding helpers to allocate and initialize a struct string_list? Maybe 
> string_list_new_nodup() and string_list_new_dup().

Maybe, but the general convention in the git codebase is to malloc()
and then call an init() function. If we're going to add *_new() helpers
like this, that would be a change for a lot more APIs than just mailmap.

And if it's just for mailmap I don't see how the inconsistency with
other code would be worth it.
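
To make the trade-off concrete, here is a rough sketch of what such a
helper could look like next to the existing malloc-then-init convention.
Everything in it is hypothetical: "struct string_list_sketch" is a
simplified stand-in for the real "struct string_list", and
sketch_init_nodup() / sketch_new_nodup() are made-up names, not anything
that exists in git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for git's "struct string_list"; not the real definition. */
struct string_list_sketch {
	char **items;
	size_t nr, alloc;
	unsigned int strdup_strings : 1;
};

/* The "init" half of the malloc-then-init convention. */
static void sketch_init_nodup(struct string_list_sketch *list)
{
	memset(list, 0, sizeof(*list));
	list->strdup_strings = 0;
}

/* Hypothetical *_new() helper: allocate and initialize in one call. */
static struct string_list_sketch *sketch_new_nodup(void)
{
	struct string_list_sketch *list = malloc(sizeof(*list));

	if (!list)
		abort(); /* stands in for xmalloc(), which dies on failure */
	sketch_init_nodup(list);
	return list;
}
```

A caller would then write "revs.mailmap = sketch_new_nodup();" instead
of the two-line malloc + init sequence. It saves one line per call site,
at the cost of being the odd one out among the other init()-style APIs.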

>>   	read_mailmap(revs.mailmap);
>>
>>   	if (prepare_revision_walk(&revs))
>> @@ -1122,7 +1122,6 @@ static const char *find_author_by_nickname(const char *name)
>>   		ctx.date_mode.type = DATE_NORMAL;
>>   		strbuf_release(&buf);
>>   		format_commit_message(commit, "%aN <%aE>", &buf, &ctx);
>> -		clear_mailmap(&mailmap);
>>   		release_revisions(&revs);
>>   		return strbuf_detach(&buf, NULL);
>>   	}
>> diff --git a/revision.c b/revision.c
>> index 553f7de8250..622f0faecc4 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -2926,10 +2926,19 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
>>   	return left;
>>   }
>>
>> +static void release_revisions_mailmap(struct string_list *mailmap)
>> +{
>> +	if (!mailmap)
>> +		return;
>> +	clear_mailmap(mailmap);
>> +	free(mailmap);
>> +}
>
> It's not a big issue but if there are no other users of this then it
> could just go inside release_revisions, my impression is that this 
> series builds a collection of very small functions whose only caller
> is release_revisions()

Yes, these are just trivial static helpers so that each line in
release_revisions() corresponds to a member of the struct, without
loops, indentation for "don't free this" etc.

At higher optimization levels the compiler will inline them, so it
makes no difference to the machine code.
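
As a rough illustration of that layout (not the actual revision.c code),
the sketch below uses made-up types and names: "struct rev_info_sketch",
"struct list_sketch" and the release_sketch*() functions are all
hypothetical stand-ins. The point is only that the top-level release
function ends up as one line per struct member, with the NULL checks
hidden in the helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Made-up stand-ins; the real "struct rev_info" has many more members. */
struct list_sketch {
	int *items;
	size_t nr;
};

struct rev_info_sketch {
	struct list_sketch *mailmap; /* NULL means "doesn't want mailmap" */
	char *buf;
};

/* One trivial static helper per member; the NULL check lives here. */
static void release_sketch_mailmap(struct list_sketch *mailmap)
{
	if (!mailmap)
		return;
	free(mailmap->items);
	free(mailmap); /* must have been heap-allocated by the caller */
}

static void release_sketch_buf(char *buf)
{
	free(buf); /* free(NULL) is a no-op */
}

/* The top-level function: one line per member, no branches or loops. */
static void release_sketch(struct rev_info_sketch *revs)
{
	release_sketch_mailmap(revs->mailmap);
	release_sketch_buf(revs->buf);
}
```

The helpers are static and have a single caller, which is the kind of
function compilers typically inline when optimizing.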



