Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n

Indeed, after changing .gitattributes to '* text', the conversion works as expected.
Thanks for all the explanations.
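
For reference, here is a minimal sketch of the test quoted below, rerun with '* text' in place of '* text=auto'. It assumes a POSIX shell with git, mktemp, tr and wc available; the temporary directory and the variable names are only illustrative. printf replaces 'echo -ne' for portability, and no commit is made because staging is enough to inspect the indexed content.

```shell
#!/bin/sh
# Sketch only: same file as in the quoted test, but forced to "text".
set -e

repo=$(mktemp -d)          # throwaway repository (illustrative location)
cd "$repo"
git init -q .

# Declare every file as text instead of relying on auto-detection.
echo '* text' > .gitattributes

# Same content as in the quoted test: CRLF line endings plus bare CRs.
printf 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' > inline-cr.txt

# Git may print a CRLF-to-LF warning on stderr here; that is expected.
git add .gitattributes inline-cr.txt

# Count CR bytes in the working file and in the staged blob.
worktree_cr=$(tr -cd '\r' < inline-cr.txt | wc -c | tr -d ' ')
index_cr=$(git show :inline-cr.txt | tr -cd '\r' | wc -c | tr -d ' ')

echo "CR bytes in working tree: $worktree_cr"
echo "CR bytes in index:        $index_cr"
```

The working file contains six CR bytes. After staging, the four CRs that directly precede an LF should be gone, leaving the two bare CRs untouched, which matches the "Expected result" in the quoted test.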

At first, my use case was some source files (imported from another
VCS) with CR in different contexts:
 - lines ending with CRCRLF
 - content mostly in LF or CRLF, but with some bare CRs that should be EOLs
 - CR in the middle of a line for no reason!
For all of these, I will fix the files during import.

But while digging, I found some shell and awk scripts that use CR as a
valid character in a search/replacement string. I know the EOL should
not be CRLF in that case, but I don't know whether the same situation
could occur in DOS batch files or PowerShell scripts with CRLF EOLs.

2015-04-21 21:28 GMT+02:00 Torsten Bögershausen <tboegi@xxxxxx>:
> On 2015-04-21 15.51, Alexandre Garnier wrote:
>> Here is a test:
>>
>> git init -q crlf-test
>> cd crlf-test
>> echo '*       text=auto' > .gitattributes
>> git add .gitattributes
>> git commit -q -m "Normalize EOL"
>> echo -ne 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' > inline-cr.txt
>> echo "Working directory content:"
>> cat -A inline-cr.txt
>> echo
>> git add inline-cr.txt
>> echo "Indexed content:"
>> git show :inline-cr.txt | cat -A
>>
>> Result
>> ------
>> File content:
>> some content^M$
>> other ^Mcontent with CR^M$
>> content^M$
>> again content with^M^M$
>>
>> Indexed content:
>> some content^M$
>> other ^Mcontent with CR^M$
>> content^M$
>> again content with^M^M$
>>
>> Expected result
>> ---------------
>> File content:
>> some content^M$
>> other ^Mcontent with CR^M$
>> content^M$
>> again content with^M^M$
>>
>> Indexed content:
>> some content$
>> other ^Mcontent with CR$
>> content$
>> again content with^M$
>> # or even 'again content with$' for this last line
>>
>> If you remove the \r that are not at the end of the lines, EOL are
>> converted as expected:
>> File content:
>> some content^M$
>> other content with CR^M$
>> content^M$
>> again content with^M$
>>
>> Indexed content:
>> some content$
>> other content with CR$
>> content$
>> again content with$
>>
>
> First of all, thanks for the info.
>
> The current implementation of Git does an auto-detection
> if a file is text or binary.
>
> For a file which is "suspected to be text", it is expected to have either LF or CRLF as
> line endings, but a "bare CR" makes Git wonder:
> Should this still be treated as a text file?
> If yes, should the CR be kept as is, or should it be converted into LF (or CRLF)?
>
> The current implementation may simply be explained by the fact that nobody has so far asked
> to treat this file as "text", so the implementation assumes it to be binary.
>
> (Which makes the code a little bit easier, at the time it was written)
>
> So the status today is that you can force Git to leave the CR as is
> when you specify that the file is "text".
>
> Is there a real-life problem behind it?
> And what should happen to the CRs?
>
>
>
>
>