Hi Junio, On Wed, 16 Dec 2015, Junio C Hamano wrote: > I inspected all the callsites of this function to see if it is safe > to use such an updated logic at these callsites, and did not find > anything problematic. I could update strbuf_getline() in place, but > just to be extra careful, this series instead introduces another > helper, strbuf_getline_crlf(), that is aware of this CRLF business, > and convert the ones that are safe to update as we verify. > > * This series converts only the callers of strbuf_getline() that > would misbehave when fed a file with CRLF-terminated lines and > use the data with an unwanted CR appended at the end. With the > update the code should work as intended with such a file, without > breaking the expected behaviour when working on a file with > LF-terminated lines. > > * Callers of strbuf_getline() that expect to only read from our own > output do not have to accommodate CRLF-terminated lines, but they > can be updated to do so safely if we do not rely on the ability > to express a payload that has CR at the end. For example, the > insn sheet "rebase -i" uses typically ends each line with a > commit log summary that we ourselves do not read or use, so even > if the end-user on a platform with LF lines deliberately does > insert \r at the end of the line and strbuf_gets() removed that > \r from the payload, no unexpected behaviour should happen. > > This series does not touch them, but it may be a good GSoC > microproject to convert them to use strbuf_getline_crlf(). > > * Callers of strbuf_getline() that call strbuf_trim() immediately > on the result before doing anything, or otherwise have logic that > tolerates whitespaces at the end of the line, can continue using > strbuf_getline() and will not misbehave on CRLF-terminated lines. > > This series does not touch them; converting them to use > strbuf_getline_crlf() would a good way to document them as > dealing with "text". While doing so, they can also lose their > custom logic that happens to make CRLF-terminated lines work. > > * A caller of strbuf_getline() that wants to only see LF-terminated > lines (in other words, "ABC\r\n" must be treated as a line with 4 > bytes payload on it, A B C and CR), would be broken if we replace > it with strbuf_getline_crlf(). This series does not touch them, > and no follow-up series should, either. Thanks for the detailed explanation. I totally agree that (2) would make for a good micro-project. While I am pretty convinced that strbuf_getline() that retains a CR makes no sense, it is obviously the correct thing to prove that rather than assume it (after all, some caller might exploit strbuf_getline() for its auto-growing capabilities even on a binary stream when it is known that it cannot contain a 0x0a). And please accept my gratitude for tackling this project. Ciao, Dscho -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html