On Fri, Nov 24, 2017 at 12:24:48PM -0500, Eric Sunshine wrote:
> On Fri, Nov 24, 2017 at 11:14 AM, <tboegi@xxxxxx> wrote:
> > When a text file had been committed with CRLF and the file is committed
> > again, the CRLF are kept if .gitattributes has "text=auto".
> > This is done by analyzing the content of the blob stored in the index:
> > If a '\r' is found, Git assumes that the blob was committed with CRLF.
> >
> > The simple search for a '\r' does not always work as expected:
> > A file is encoded in UTF-16 with CRLF and committed. Git treats it as binary.
> > Now the content is converted into UTF-8. At the next commit Git treats the
> > file as text, the CRLF should be converted into LF, but isn't.
> >
> > Solution:
> > Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
> > 0 is returned directly, this is the most common case.
> > If a '\r' is found, the content is analyzed more deeply.
> >
> > Signed-off-by: Torsten Bögershausen <tboegi@xxxxxx>
> > ---
> > diff --git a/convert.c b/convert.c
> > @@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
> > -static int has_cr_in_index(const struct index_state *istate, const char *path)
> > +static int has_crlf_in_index(const struct index_state *istate, const char *path)
> >  {
> >         unsigned long sz;
> >         void *data;
> > -       int has_cr;
> > +       const char *crp;
> > +       int has_crlf = 0;
> >
> >         data = read_blob_data_from_index(istate, path, &sz);
> >         if (!data)
> >                 return 0;
> > -       has_cr = memchr(data, '\r', sz) != NULL;
> > +
> > +       crp = memchr(data, '\r', sz);
> > +       if (crp && (crp[1] == '\n')) {
>
> If I understand correctly, this isn't a NUL-terminated string and it
> might be a binary blob, so if the lone CR in a file resides at the end
> of the file, won't this try looking for LF out-of-bounds? I would have
> expected the conditional to be:
>
>     if (crp && crp - data + 1 < sz && crp[1] == '\n') {
>
> or any equivalent variation.
>

The read_blob_data_from_index() function should always append a '\0',
regardless of whether the blob is binary or not.
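
To make the two readings concrete, here is a minimal standalone sketch
of both forms of the check. The helper names are hypothetical and are
not part of convert.c. The second helper is only valid under the
assumption discussed above, namely that the buffer has a readable '\0'
at data[sz].

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not from convert.c: the bounds-checked form
 * suggested in the review. Safe for any buffer of exactly sz bytes.
 */
static int first_cr_starts_crlf_checked(const char *data, size_t sz)
{
	const char *crp = memchr(data, '\r', sz);

	/* Only read crp[1] when it still lies inside the sz-byte buffer. */
	return crp && (size_t)(crp - data) + 1 < sz && crp[1] == '\n';
}

/*
 * Hypothetical helper: the unchecked form from the patch.
 * ASSUMPTION: data[sz] is a readable '\0' appended by the caller.
 */
static int first_cr_starts_crlf_terminated(const char *data, size_t sz)
{
	const char *crp = memchr(data, '\r', sz);

	/* If the CR is the last byte, crp[1] reads the trailing '\0'. */
	return crp && crp[1] == '\n';
}
```

With a trailing '\0' in place, both helpers agree when the first CR is
the last byte of the blob: the unchecked form reads the terminator,
which is not '\n'. The explicit bound only matters for buffers that
lack that terminator.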