On Fri, Nov 24, 2017 at 11:14 AM, <tboegi@xxxxxx> wrote:
> When a text file had been committed with CRLF and the file is committed
> again, the CRLF are kept if .gitattributes has "text=auto".
> This is done by analyzing the content of the blob stored in the index:
> If a '\r' is found, Git assumes that the blob was committed with CRLF.
>
> The simple search for a '\r' does not always work as expected:
> A file is encoded in UTF-16 with CRLF and committed. Git treats it as binary.
> Now the content is converted into UTF-8. At the next commit Git treats the
> file as text, the CRLF should be converted into LF, but isn't.
>
> Solution:
> Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
> 0 is returned directly, this is the most common case.
> If a '\r' is found, the content is analyzed more deeply.
>
> Signed-off-by: Torsten Bögershausen <tboegi@xxxxxx>
> ---
> diff --git a/convert.c b/convert.c
> @@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
> -static int has_cr_in_index(const struct index_state *istate, const char *path)
> +static int has_crlf_in_index(const struct index_state *istate, const char *path)
>  {
>         unsigned long sz;
>         void *data;
> -       int has_cr;
> +       const char *crp;
> +       int has_crlf = 0;
>
>         data = read_blob_data_from_index(istate, path, &sz);
>         if (!data)
>                 return 0;
> -       has_cr = memchr(data, '\r', sz) != NULL;
> +
> +       crp = memchr(data, '\r', sz);
> +       if (crp && (crp[1] == '\n')) {

If I understand correctly, this isn't a NUL-terminated string and it
might be a binary blob, so if a lone CR is the last byte of the file,
won't this look for the LF out of bounds? I would have expected the
conditional to be:

    if (crp && crp - data + 1 < sz && crp[1] == '\n') {

or any equivalent variation.

> +               unsigned int ret_stats;
> +               ret_stats = gather_convert_stats(data, sz);
> +               if (!(ret_stats & CONVERT_STAT_BITS_BIN) &&
> +                   (ret_stats & CONVERT_STAT_BITS_TXT_CRLF))
> +                       has_crlf = 1;
> +       }
>         free(data);
> -       return has_cr;
> +       return has_crlf;
>  }
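
To make the bounds concern concrete, here is a minimal standalone sketch
of the guarded check. It is not a replacement for the patch: the helper
name first_cr_is_crlf() is hypothetical, it works on a plain buffer
rather than a blob read from the index, and it leaves out the
gather_convert_stats() step. It takes the buffer as a char pointer before
subtracting, since "data" is a "void *" in the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of convert.c: report whether the first
 * '\r' in a buffer of sz bytes is immediately followed by '\n'.
 * The buffer is not assumed to be NUL-terminated.
 */
static int first_cr_is_crlf(const void *data, size_t sz)
{
	const char *buf = data;
	const char *crp = memchr(buf, '\r', sz);

	if (!crp)
		return 0;
	/* Only read crp[1] if that byte is still inside the buffer. */
	if ((size_t)(crp - buf) + 1 >= sz)
		return 0;
	return crp[1] == '\n';
}
```

With this guard, a buffer such as "abc\r" with sz = 4 yields 0 without
reading past the end, while "a\r\nb" with sz = 4 yields 1. Like the
patch, the sketch only inspects the first '\r' that memchr() finds.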