Re: [PATCH] config: support values longer than 1024 bytes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 6, 2011 at 5:35 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Wed, Apr 06, 2011 at 11:10:42AM +0200, Erik Faye-Lund wrote:
>
>> > But for the other, one of the invariants of strbuf is that the string is
>> > always NUL-terminated. So I would expect strbuf_init to properly
>> > NUL-terminate after growing based on the hint.
>>
>> I agree. An unterminated yet non-NULL return from strbuf_detach is
>> just dangerous behavior. Something like this should probably be
>> applied:
>>
>> ---8<---
>> diff --git a/strbuf.c b/strbuf.c
>> index 77444a9..538035a 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -24,14 +24,16 @@ int suffixcmp(const char *str, const char *suffix)
>>   * buf is non NULL and ->buf is NUL terminated even for a freshly
>>   * initialized strbuf.
>>   */
>> -char strbuf_slopbuf[1];
>> +char strbuf_slopbuf[1] = { '\0' };
>
> This hunk is redundant. slopbuf will already be initialized to 0.

Right, silly me. I somehow thought that implicit zero-initialization
only applied to static variables, but K&R tells me I was wrong. Thanks
for pointing it out :)

>
>>  void strbuf_init(struct strbuf *sb, size_t hint)
>>  {
>>       sb->alloc = sb->len = 0;
>>       sb->buf = strbuf_slopbuf;
>> -     if (hint)
>> +     if (hint) {
>>               strbuf_grow(sb, hint);
>> +             sb->buf[0] = '\0';
>> +     }
>>  }
>
> But this one is the right fix.
>

OK, I'll turn this into a two-patch series, then.

>> that. But this brings a new issue: leaving potentially huge blocks of
>> memory (especially since this patch is about long lines) allocated
>> inside a function can be a bit nasty. But it's probably not a big deal
>
> Yeah. It's just one block, though, and in the normal case it is probably
> only about 80 characters. So it is more efficient than what's there now. :)
>
> Somebody could have some gigantic value, though, and yes, we'll grow to
> the biggest one and never free that memory. You could also have
> parse_value take a strbuf parameter to output into, and then free it
> after config reading is done.
>
>> In other words: I think you're right, it's a much better approach.
>> Less allocations, less penalty on the start-up time for every little
>> git-command.
>
> I doubt the efficiency increase is measurable. We end up xstrdup'ing
> quite a few of the values in the config callbacks anyway. I would do
> whatever seems most natural for reading/writing the code.
>

I think I'll just leave the single leak in. Since it would allocate
the buffer lazily, it would really only "waste" more memory than the
existing implementation when the old implementation would fail. So my
conscience is clear ;)

>> > I do wonder, though, if we could be reusing the unquote_c_style()
>> > function in quote.c. They are obviously similar, but I haven't checked
>> > if there is more going on in the config code.
>>
>> Hmm, this is an interesting suggestion. It would be a part of a bigger
>> change though: unquote_c_style requires it's input to be in memory,
>> while parse_value uses a function called get_next_char to feed the
>> parser. So we'd either have to read the entire file into memory, or
>> find some way to read the file line-by-line while handling \n-escaping
>> correctly.
>>
>> It also seems like there's differences in what kind of escaping and
>> normalization the two functions handle; unquote_c_style handles more
>> escaped character sequences, while parse_value normalize all
>> non-escaped space-characters ('\t' et. al) into SP. This might not be
>> such a big problem in reality.
>
> This was just a random thought that I had, and I didn't investigate it
> how hard it would be.  If it turns out to be too much trouble, just
> forget it.
>

Yeah. I think I'll skip it, since it isn't dead obvious.

Thanks for the input, I'll cook up a new version soon.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]