Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()

On 08.12.2017 at 11:14, Jeff King wrote:
> On Thu, Dec 07, 2017 at 01:47:14PM -0800, Junio C Hamano wrote:
> 
>>> diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
>>> index 22034f87e7..8e8a15ea4a 100644
>>> --- a/builtin/fmt-merge-msg.c
>>> +++ b/builtin/fmt-merge-msg.c
>>> @@ -377,7 +377,8 @@ static void shortlog(const char *name,
>>>   			string_list_append(&subjects,
>>>   					   oid_to_hex(&commit->object.oid));
>>>   		else
>>> -			string_list_append(&subjects, strbuf_detach(&sb, NULL));
>>> +			string_list_append_nodup(&subjects,
>>> +						 strbuf_detach(&sb, NULL));
>>>   	}
>>>   
>>>   	if (opts->credit_people)
>>
>> What is leaked comes from strbuf, so the title is not a lie, but I
>> tend to think that this leak is caused by a somewhat strange
>> string_list API.  The subjects string-list is initialized as a "dup"
>> kind, but a caller that wants to avoid leaking can (and should) use
>> _nodup() call to add a string without duping.  It all feels a bit
>> too convoluted.
> 
> I'm not sure it's string-list's fault. Many callers (including this one)
> have _some_ entries whose strings must be duplicated and others which do
> not.
> 
> So either:
> 
>    1. The list gets marked as "nodup", and we add an extra xstrdup() to the
>       oid_to_hex call above. And also need to remember to free() the
>       strings later, since the list does not own them.
> 
> or
> 
>    2. We mark it as "dup" and incur an extra allocation and copy, like:
> 
>         string_list_append(&subjects, sb.buf);
>         strbuf_release(&sb);

The two modes (dup/nodup) make string_list code tricky.  Not sure
how far we'd get with something simpler (e.g. an array of char pointers),
but having the caller do all string allocations would make the code
easier to analyze.
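
A minimal sketch of what I have in mind, compilable outside git.  The
names ptr_list, ptr_list_append(), ptr_list_clear() and dup_or_die() are
made up for illustration and do not exist in our tree; dup_or_die()
merely stands in for xstrdup().  The list never copies anything, so
there is exactly one ownership rule to check at each call site:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration only; not part of git. */
struct ptr_list {
	char **items;
	size_t nr, alloc;
};

/* Takes ownership of "s"; the list never copies strings itself. */
static void ptr_list_append(struct ptr_list *list, char *s)
{
	if (list->nr == list->alloc) {
		size_t new_alloc = list->alloc ? 2 * list->alloc : 8;
		char **p = realloc(list->items, new_alloc * sizeof(*p));
		if (!p)
			abort();
		list->items = p;
		list->alloc = new_alloc;
	}
	list->items[list->nr++] = s;
}

/* Frees every string and the array; there is only one ownership mode. */
static void ptr_list_clear(struct ptr_list *list)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		free(list->items[i]);
	free(list->items);
	memset(list, 0, sizeof(*list));
}

/* Stand-in for xstrdup() so the sketch compiles outside git. */
static char *dup_or_die(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}
```

With that, the caller would pass dup_or_die(oid_to_hex(...)) in one
branch and the detached buffer in the other, and both entries would be
freed the same way.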

> So I'd really blame the caller, which doesn't want to do (2) out of a
> sense of optimization. It could also perhaps write it as:
> 
>    while (commit = get_revision(rev)) {
> 	strbuf_reset(&sb);
> 	... maybe put some stuff in sb ...
> 	if (!sb.len)
> 		string_list_append(&subjects, oid_to_hex(obj));
> 	else
> 		string_list_append(&subjects, sb.buf);
>    }
>    strbuf_release(&sb);
> 
> which at least avoids the extra allocations.

Right, we'd just have extra string copies in that case.
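
To spell out the trade-off, here is a rough standalone model of that
loop.  Nothing in it is git code: collect_subjects() and copy_string()
are hypothetical names, the scratch buffer plays the role of "sb", and
"fallback" plays the role of the oid_to_hex() result.  The scratch
buffer is allocated at most a few times and freed once after the loop;
the price is one copy per entry:

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for xstrdup(); hypothetical helper, not part of git. */
static char *copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/*
 * Fills out[0..n-1] with copies owned by the caller.  One scratch
 * buffer is reused for all iterations and released once at the end.
 */
static void collect_subjects(const char **raw, const char *fallback,
			     size_t n, char **out)
{
	char *scratch = NULL;
	size_t alloc = 0, i;

	for (i = 0; i < n; i++) {
		const char *s = raw[i];
		size_t len;

		while (*s == ' ')	/* crude left trim, spaces only */
			s++;
		len = strlen(s);
		if (len + 1 > alloc) {	/* grow the scratch buffer if needed */
			char *p = realloc(scratch, len + 1);
			if (!p)
				abort();
			scratch = p;
			alloc = len + 1;
		}
		memcpy(scratch, s, len + 1);
		/* the extra copy: the list entry never aliases scratch */
		out[i] = copy_string(len ? scratch : fallback);
	}
	free(scratch);	/* single release, no matter how the loop went */
}
```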

> By the way, I think there's another quite subtle leak in this function.
> We do this:
> 
>    format_commit_message(commit, "%s", &sb, &ctx);
>    strbuf_ltrim(&sb);
> 
> and then only use "sb" if sb.len is non-zero. But we may have actually
> allocated to create our zero-length string (e.g., if we had a strbuf
> full of spaces and trimmed them all off). Since we reuse "sb" over and
> over as we loop, this will actually only leak once for the whole loop,
> not once per iteration. So it's probably not a big deal, but writing it
> with the explicit reset/release pattern fixes that (and is more
> idiomatic for our code base, I think).

It's subtle, but I think it's not leaking, at least not in your example
case (and I can't think of another way).  IIUC format_subject(), which
handles the "%s" part, doesn't touch sb if the subject is made up only
of whitespace.
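
A toy model of that argument, in case it helps.  This is not the real
format_subject() or strbuf; mini_buf, mini_buf_add() and add_subject()
are invented names, and the sketch only assumes what I described above:
whitespace is skipped before anything is appended, so a
whitespace-only subject never triggers an allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a growable buffer; not git's strbuf. */
struct mini_buf {
	char *buf;	/* NULL until the first append */
	size_t len, alloc;
};

/* Appends n bytes and keeps the buffer NUL-terminated. */
static void mini_buf_add(struct mini_buf *mb, const char *data, size_t n)
{
	if (mb->len + n + 1 > mb->alloc) {
		size_t new_alloc = mb->len + n + 1;
		char *p = realloc(mb->buf, new_alloc);
		if (!p)
			abort();
		mb->buf = p;
		mb->alloc = new_alloc;
	}
	memcpy(mb->buf + mb->len, data, n);
	mb->len += n;
	mb->buf[mb->len] = '\0';
}

/* Skips leading whitespace first and appends only if something is left. */
static void add_subject(struct mini_buf *mb, const char *subject)
{
	while (*subject == ' ' || *subject == '\t' || *subject == '\n')
		subject++;
	if (*subject)
		mini_buf_add(mb, subject, strlen(subject));
}
```

If the buffer is only ever filled like this, len == 0 implies that
nothing was allocated, so skipping the release in that case loses
nothing.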

René


