On Fri, Feb 02, 2018 at 11:17:04AM -0800, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@xxxxxx> writes:
>
> > There are 2 opposite opinions/user expectations here:
> >
> > a) They are binary in the working tree, so git should leave the line endings
> >    as is. (Unless specified otherwise in the .gitattributes file)
> > ...
> > b) They are text files in the index. Git will convert line endings
> >    if core.autocrlf is true (or the .gitattributes file specifies "-text")
>
> I sense that you seem to be focusing on the distinction between "in
> the working tree" vs "in the index" while contrasting.  The "binary
> vs text" in your "binary in wt, text in index" is based on the
> default heuristics without any input from end-users or the project
> that uses Git that happens to contain such files.  If the users and
> the project that uses Git want to treat contents in a path as text,
> it is text even when it is (re-)encoded to UTF-16, no?
>
> Such files may be (mis)classified as binary with the default
> heuristics when there is no help from what is written in the
> .gitattributes file, but here we are talking about the case where
> the user explicitly tells us it is in UTF-16, right?  Is there such a
> thing as UTF-16 binary?

I don't think so; by definition UTF-16 is meant to be text.
(This means that git ls-files --eol needs some update; I can have a look.)

Do we agree that UTF-16 is text?
If yes, could Git assume that the "text" attribute is set when
working-tree-encoding is set?

I would even go a step further and demand that the user -must- make a
decision about the line endings for working-tree-encoded files:

working-tree-encoding=UTF-16                # illegal, die()
working-tree-encoding=UTF-16 text=auto      # illegal, die()
working-tree-encoding=UTF-16 -text          # no eol conversion
working-tree-encoding=UTF-16 text           # eol according to core.eol
working-tree-encoding=UTF-16 text eol=lf    # LF
working-tree-encoding=UTF-16 text eol=crlf  # CRLF

What do you think?
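
To make the six combinations above easier to discuss, here is a minimal Python sketch of the proposed rule as a lookup function. It is only an illustration of the proposal, not Git code: the function name eol_policy() and the returned labels ("die", "none", "core.eol", "lf", "crlf") are made up for this example, and any combination not listed above is rejected rather than guessed at.

```python
def eol_policy(attrs: str) -> str:
    """Map an attribute string to the eol handling proposed in this mail.

    Hypothetical helper for illustration only; names and labels are
    assumptions, not part of Git.
    """
    settings = {}
    for tok in attrs.split():
        if tok.startswith("-"):
            settings[tok[1:]] = False      # "-text": attribute unset
        elif "=" in tok:
            key, value = tok.split("=", 1)
            settings[key] = value          # "eol=lf": attribute has a value
        else:
            settings[tok] = True           # "text": attribute set

    if "working-tree-encoding" not in settings:
        raise ValueError("the proposal only covers working-tree-encoded files")

    text = settings.get("text")
    if text is None or text == "auto":
        return "die"                       # user must decide explicitly
    if text is False:
        return "none"                      # no eol conversion
    if text is True:
        eol = settings.get("eol")
        if eol is None:
            return "core.eol"              # follow the core.eol setting
        if eol in ("lf", "crlf"):
            return eol
    raise ValueError("combination not covered by the proposal")


if __name__ == "__main__":
    for line in (
        "working-tree-encoding=UTF-16",
        "working-tree-encoding=UTF-16 text=auto",
        "working-tree-encoding=UTF-16 -text",
        "working-tree-encoding=UTF-16 text",
        "working-tree-encoding=UTF-16 text eol=lf",
        "working-tree-encoding=UTF-16 text eol=crlf",
    ):
        print(f"{line:45} {eol_policy(line)}")
```

The sketch deliberately leaves open what should happen for combinations the list does not mention, such as eol=lf given without text.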