On Wed, Apr 06, 2011 at 11:10:42AM +0200, Erik Faye-Lund wrote: > > But for the other, one of the invariants of strbuf is that the string is > > always NUL-terminated. So I would expect strbuf_init to properly > > NUL-terminate after growing based on the hint. > > I agree. An unterminated yet non-NULL return from strbuf_detach is > just dangerous behavior. Something like this should probably be > applied: > > ---8<--- > diff --git a/strbuf.c b/strbuf.c > index 77444a9..538035a 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -24,14 +24,16 @@ int suffixcmp(const char *str, const char *suffix) > * buf is non NULL and ->buf is NUL terminated even for a freshly > * initialized strbuf. > */ > -char strbuf_slopbuf[1]; > +char strbuf_slopbuf[1] = { '\0' }; This hunk is redundant. slopbuf will already be initialized to 0. > void strbuf_init(struct strbuf *sb, size_t hint) > { > sb->alloc = sb->len = 0; > sb->buf = strbuf_slopbuf; > - if (hint) > + if (hint) { > strbuf_grow(sb, hint); > + sb->buf[0] = '\0'; > + } > } But this one is the right fix. > that. But this brings a new issue: leaving potentially huge blocks of > memory (especially since this patch is about long lines) allocated > inside a function can be a bit nasty. But it's probably not a big deal Yeah. It's just one block, though, and in the normal case it is probably only about 80 characters. So it is more efficient than what's there now. :) Somebody could have some gigantic value, though, and yes, we'll grow to the biggest one and never free that memory. You could also have parse_value take a strbuf parameter to output into, and then free it after config reading is done. > In other words: I think you're right, it's a much better approach. > Less allocations, less penalty on the start-up time for every little > git-command. I doubt the efficiency increase is measurable. We end up xstrdup'ing quite a few of the values in the config callbacks anyway. I would do whatever seems most natural for reading/writing the code. > > I do wonder, though, if we could be reusing the unquote_c_style() > > function in quote.c. They are obviously similar, but I haven't checked > > if there is more going on in the config code. > > Hmm, this is an interesting suggestion. It would be a part of a bigger > change though: unquote_c_style requires it's input to be in memory, > while parse_value uses a function called get_next_char to feed the > parser. So we'd either have to read the entire file into memory, or > find some way to read the file line-by-line while handling \n-escaping > correctly. > > It also seems like there's differences in what kind of escaping and > normalization the two functions handle; unquote_c_style handles more > escaped character sequences, while parse_value normalize all > non-escaped space-characters ('\t' et. al) into SP. This might not be > such a big problem in reality. This was just a random thought that I had, and I didn't investigate it how hard it would be. If it turns out to be too much trouble, just forget it. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html