From: Torsten Bögershausen <tboegi@xxxxxx> UTF-16 encoded files are treated as "binary" by Git, and no CRLF conversion is done. When the UTF-16 encoded files are converted into UF-8 using the new "working-tree-encoding", the CRLF are converted if core.autocrlf is true. This may lead to confusion: A tool writes an UTF-16 encoded file with CRLF. The file is commited with core.autocrlf=true, the CLRF are converted into LF. The repo is pushed somewhere and cloned by a different user, who has decided to use core.autocrlf=false. He uses the same tool, and now the CRLF are not there as expected, but LF, make the file useless for the tool. Avoid this (possible) confusion by ignoring core.autocrlf for all files which have "working-tree-encoding" defined. The user can still use a .gitattributes file and specify the line endings like "text=auto", "text", or "text eol=crlf" and let that .gitattribute file travel together with push and clone. Change convert.c to e more careful, simplify the initialization when attributes are retrived (and none are specified) and update the documentation. Signed-off-by: Torsten Bögershausen <tboegi@xxxxxx> --- Documentation/gitattributes.txt | 9 ++++++--- convert.c | 15 ++++++++++++--- 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index a8dbf4be3..3665c4677 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -308,12 +308,15 @@ Use the `working-tree-encoding` attribute only if you cannot store a file in UTF-8 encoding and if you want Git to be able to process the content as text. +Note that when `working-tree-encoding` is defined, core.autocrlf is ignored. +Set the `text` attribute (or `text=auto`) to enable CRLF conversions. + Use the following attributes if your '*.txt' files are UTF-16 encoded -with byte order mark (BOM) and you want Git to perform automatic line -ending conversion based on your platform. +with byte order mark (BOM) and you want Git to perform line +ending conversion based on core.eol. ------------------------ -*.txt text working-tree-encoding=UTF-16 +*.txt working-tree-encoding=UTF-16 text ------------------------ Use the following attributes if your '*.txt' files are UTF-16 little diff --git a/convert.c b/convert.c index 13fad490c..e7f11d1db 100644 --- a/convert.c +++ b/convert.c @@ -1264,15 +1264,24 @@ static void convert_attrs(struct conv_attrs *ca, const char *path) } ca->checkout_encoding = git_path_check_encoding(ccheck + 5); } else { - ca->drv = NULL; - ca->crlf_action = CRLF_UNDEFINED; - ca->ident = 0; + memset(ca, 0, sizeof(*ca)); } /* Save attr and make a decision for action */ ca->attr_action = ca->crlf_action; if (ca->crlf_action == CRLF_TEXT) ca->crlf_action = text_eol_is_crlf() ? CRLF_TEXT_CRLF : CRLF_TEXT_INPUT; + /* + * Often UTF-16 encoded files are read and written by programs which + * really need CRLF, and it is important to keep the CRLF "as is" when + * files are committed with core.autocrlf=true and the repo is pushed. + * The CRLF would be converted into LF when the repo is cloned to + * a machine with core.autocrlf=false. + * Obey the "text" and "eol" attributes and be independent on the + * local core.autocrlf for all "encoded" files. + */ + if ((ca->crlf_action == CRLF_UNDEFINED) && ca->checkout_encoding) + ca->crlf_action = CRLF_BINARY; if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_FALSE) ca->crlf_action = CRLF_BINARY; if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_TRUE) -- 2.16.0.rc0.2.g64d3e4d0cc.dirty