On Sunday, January 18, 2015, Yi EungJun <semtlenori@xxxxxxxxx> wrote: > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Limit the number of languages to 1,000 because q-value must not be > smaller than 0.001, and limit the length of Accept-Language header to > 4,000 bytes for some HTTP servers which cannot accept such long header. > > Signed-off-by: Yi EungJun <eungjun.yi@xxxxxxxxxxxxx> > --- > diff --git a/http.c b/http.c > index 040f362..349b033 100644 > --- a/http.c > +++ b/http.c > @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +static void write_accept_language(struct strbuf *buf) > +{ > + [...] > + const int MAX_DECIMAL_PLACES = 3; > + const int MAX_LANGUAGE_TAGS = 1000; > + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; > + struct strbuf *language_tags = NULL; > + int num_langs; Mental note: 'num_langs' is not initialized. > + const char *s = get_preferred_languages(); > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!s) > + return; > + > + /* > + * Split the colon-separated string of preferred languages into > + * language_tags array. > + */ > + do { > + /* increase language_tags array to add new language tag */ > + REALLOC_ARRAY(language_tags, num_langs + 1); 'num_langs' is used here uninitialized. > + strbuf_init(&language_tags[num_langs], 0); > + > + /* collect language tag */ > + for (; *s && (isalnum(*s) || *s == '_'); s++) > + strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s); > + > + /* skip .codeset, @modifier and any other unnecessary parts */ > + while (*s && *s != ':') > + s++; > + > + if (language_tags[num_langs].len > 0) { Mental note: An empty ("") language tag is never allowed in language_tags[]. > + num_langs++; This is a little bit ugly. At the top of the loop, you allocate space in the array for a strbuf and initialize it. However, if the language tag is empty (""), then 'num_langs' is never incremented, so the next time through the loop, strbuf_init() is invoked on the same block of memory (assuming the realloc was a no-op since the allocation size did not change), overwriting whatever was there and possibly leaking memory. In this particular case, by examining the parser code closely, we can see that nothing was added to the strbuf, so nothing is being leaked the next time around, given the current implementation of strbuf. However, this is potentially fragile. A change to the implementation of strbuf in the future (for instance, if strbuf_init() allocates memory immediately) could result in a leak here. Moreover, this no-leak situation only holds true if no text at all has been added to the strbuf after strbuf_init(). If someone changes the parser in the future to operate a bit differently so that some text is added and then removed from the strbuf, even though the end result still has length is 0, then it will start leaking. One way to make this more robust would be to have a separate strbuf for collecting the language tag. When you encounter a non-empty tag, only then grow the array and initialize the new strbuf in the array. Finally, use strbuf_swap() to swap the collected language tag into the new array position. Something like this: struct strbuf tag = STRBUF_INIT; do { for (; *s && (isalnum(*s) || *s == '_'); s++) strbuf_addch(&tag, *s == '_' ? '-' : *s); [...] if (tag.len) { num_langs++; REALLOC_ARRAY(language_tags, num_langs); strbuf_init(&language_tags[num_langs], 0); strbuf_swap(&tag, &language_tags[num_langs]); if (num_langs >= ...) break; } while (...); strbuf_release(&tag); > + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ > + break; > + } > + } while (*s++); > + > + /* write Accept-Language header into buf */ > + if (num_langs >= 1) { > + int i; > + int last_buf_len; Mental note: 'last_buf_len' is not initialized. > + int max_q; > + int decimal_places; > + char q_format[32]; > + > + /* add '*' */ > + REALLOC_ARRAY(language_tags, num_langs + 1); > + strbuf_init(&language_tags[num_langs], 0); > + strbuf_addstr(&language_tags[num_langs++], "*"); > + > + /* compute decimal_places */ > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; > + decimal_places++, max_q *= 10) > + ; > + > + sprintf(q_format, ";q=0.%%0%dd", decimal_places); > + > + strbuf_addstr(buf, "Accept-Language: "); > + > + for(i = 0; i < num_langs; i++) { > + if (language_tags[i].len == 0) > + continue; The parsing code does not allow empty tags ("") in language_tags[], so this conditional is useless, isn't it? > + > + if (i > 0) > + strbuf_addstr(buf, ", "); > + > + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); This leaks the string detached from 'language_tag[i]' since strbuf_addstr() does not take ownership of it. > + if (i > 0) > + strbuf_addf(buf, q_format, max_q - i); > + > + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { > + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); 'last_buf_len' is (potentially) used here uninitialized the first time through loop. > + break; > + } > + > + last_buf_len = buf->len; > + } > + } > + > + free(language_tags); This _seems_ to be okay since strbuf_detach() was invoked for each strbuf in language_tags[], however, those strings were in fact leaked (as noted above), so it's not actually correct. > +} > + > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * This function always return non-NULL string as strbuf_detach() does. Repeating from [1]: It's not good form to describe the published API in terms of an implementation detail (strbuf_detach). Also, it would be more idiomatic in C to return NULL rather than empty string. [1]: http://article.gmane.org/gmane.comp.version-control.git/261810/ > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static const char *get_accept_language(void) > +{ > + if (!cached_accept_language) { > + struct strbuf buf = STRBUF_INIT; > + write_accept_language(&buf); > + cached_accept_language = strbuf_detach(&buf, NULL); > + } > + > + return cached_accept_language; > +} > + -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html