Re: [PATCH 2/2] strbuf: allow to use preallocated memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/31/2016 08:41 AM, Johannes Schindelin wrote:
> Hi Michael,
> 
> On Tue, 31 May 2016, Michael Haggerty wrote:
> 
>> On 05/30/2016 02:13 PM, Johannes Schindelin wrote:
>>> [...]
>>>> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>>>>  {
>>>>  	char *res;
>>>>  	strbuf_grow(sb, 0);
>>>> -	res = sb->buf;
>>>> +	if (sb->flags & STRBUF_OWNS_MEMORY)
>>>> +		res = sb->buf;
>>>> +	else
>>>> +		res = xmemdupz(sb->buf, sb->alloc - 1);
>>>
>>> This looks like a usage to be avoided: if we plan to detach the buffer,
>>> anyway, there is no good reason to allocate it on the heap first. I would
>>> at least issue a warning here.
>>
>> I think this last case should be changed to
>>
>>     res = xmemdupz(sb->buf, sb->len);
>>
>> Johannes, if this change is made then I think that there is a reasonable
>> use case for calling `strbuf_detach()` on a strbuf that wraps a
>> stack-allocated string, so I don't think that a warning is needed.
>>
>> I think this change makes sense. After all, once a caller detaches a
>> string, it is probably not planning on growing/shrinking it anymore, so
>> any more space than that would probably be wasted. In fact, since the
>> caller has no way to ask how much extra space the detached string has
>> allocated, it is almost guaranteed that the space would be wasted.
> 
> Ah, I did not think of that. (We could of course introduce
> strbuf_realloc_snugly() or some such, but I agree: why?) So the best
> counter-example to my objection might be something like this:
> 
> -- snip --
> Subject: git_pathdup(): avoid realloc()
> 
> When generating Git paths, we already know the maximum length:
> MAX_PATH. Let's avoid realloc()-caused fragmentation by using a
> fixed buffer.
> 
> As a side effect, we avoid overallocating the end result, too:
> previously, strbuf_detach() would not realloc() to avoid wasting
> the (sb->alloc - sb->len) bytes, while it malloc()s the precisely
> needed amount of memory when detaching fixed buffers.
> 
> diff --git a/path.c b/path.c
> index 2511d8a..64fd3ee 100644
> --- a/path.c
> +++ b/path.c
> @@ -426,7 +426,8 @@ const char *git_path(const char *fmt, ...)
>  
>  char *git_pathdup(const char *fmt, ...)
>  {
> -	struct strbuf path = STRBUF_INIT;
> +	char buffer[MAX_PATH];
> +	struct strbuf path = STRBUF_WRAP_FIXED(buffer);
>  	va_list args;
>  	va_start(args, fmt);
>  	do_git_path(&path, fmt, args);
> -- snap --

I like the idea, but apparently MAX_PATH is not a hard limit that OSs
must respect (e.g., see [1]). The existing implementation can generate
paths longer than MAX_PATH whereas your replacement cannot.

This would be easy to solve by using STRBUF_WRAP_MOVABLE() instead of
STRBUF_WRAP_FIXED() in your patch. We would thereby accept the
possibility that the strbuf might need to be expanded for really long
paths. However, if that happens, then strbuf_detach() is likely to
return the string in an overallocated memory area.

So if there are callers who *really* care about not having overallocated
memory, we might want a strbuf_detach_snugly().

Your (only half-serious) idea for strbuf_realloc_snugly() would have the
disadvantage that it couldn't work with a fixed strbuf, whereas
strbuf_detach_snugly() could. It would be sad for callers to have to
worry about the distinction.

Michael

[1] http://insanecoding.blogspot.de/2007/11/pathmax-simply-isnt.html

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]