Re: [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 02, 2018 at 11:17:04AM -0800, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@xxxxxx> writes:
> 
> > There are 2 opposite opionions/user expectations here:
> >
> > a) They are binary in the working tree, so git should leave the line endings
> >    as is. (Unless specified otherwise in the .attributes file)
> > ...
> > b) They are text files in the index. Git will convert line endings
> >    if core.autocrlf is true (or the .gitattributes file specifies "-text")
> 
> I sense that you seem to be focusing on the distinction between "in
> the working tree" vs "in the index" while contrasting.  The "binary
> vs text" in your "binary in wt, text in index" is based on the
> default heuristics without any input from end-users or the project
> that uses Git that happens to contain such files.  If the users and
> the project that uses Git want to treat contents in a path as text,
> it is text even when it is (re-)encoded to UTF-16, no?
> 
> Such files may be (mis)classified as binary with the default
> heuristics when there is no help from what is written in the
> .gitattributes file, but here we are talking about the case where
> the user explicitly tells us it is in UTF-16, right?  Is there such a
> thing as UTF-16 binary?

I don't think so, by definiton UTF-16 is ment to be text.
(this means that git ls-files --eol needs some update, I can have a look)

Do we agree that UTF-16 is text ?
If yes, could Git assume that the "text" attribute is set when
working-tree-encoding is set ?

I would even go a step further and demand that the user -must- make a decision
about the line endings for working-tree-encoded files:
working-tree-encoding=UTF-16                 # illegal, die()
working-tree-encoding=UTF-16 text=auto       # illegal, die()
working-tree-encoding=UTF-16 -text           # no eol conversion
working-tree-encoding=UTF-16 text            # eol according to core.eol
working-tree-encoding=UTF-16 text eol=lf     # LF
working-tree-encoding=UTF-16 text eol=crlf   # CRLF

What do you think ?






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux