Re: Files with \r\n\n line endings can result in needing to renormalize twice, after deleting checked out file and restoring from repo

On 01/06/2022 11:07, Philip, Bevan wrote:
> Hey Philip,
>
> Thanks for the response!
>
>> ... however, if I remember the design discussion correctly, normalisation was decided to be just the conversion of the Windows style EOL = `\r\n` to the Linux/*nix style EOL =`\n`, and any other characters
>> (utf8 / ascii bytes) were to be unchanged, including random '\r'
>> characters. So in that respect I think it is working as initially designed.
> This makes sense.
>
>> Do you have any information on how the mixed EOL styles (extra \r etc) came about?
> I wish I knew how this file came about, but the people that put these files in our VCS have long left. I suspect some broken generation tool.

I vaguely remember tales that early Macs used \r as their EOL character,
so it may have been that.
>
>> Should those extra \r characters also be separate EOLs? (and how to
>> decide..?)
> Most tooling I use seems to do this, but I agree that this is an ambiguous topic.
Maybe an extra `sed` invocation, changing all the \r to \n, would do in such cases!
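
A rough sketch of the two possible readings, in case it helps. It is not run against the real files, and the file names are made up for illustration. The `tr` line treats every \r as its own line ending; the `sed` line treats any run of \r in front of the \n as part of one line ending.

```shell
# Hypothetical sample file with \r\r\n line endings (names are illustrative).
work=$(mktemp -d) && cd "$work"
printf 'Test\r\r\nFinal Line\r\r\n' > sample.txt

# Reading 1: every \r is a separate EOL, so each \r\r\n becomes \n\n\n.
tr '\r' '\n' < sample.txt > each_cr_is_eol.txt

# Reading 2: a run of \r before \n belongs to one EOL, so \r\r\n becomes \n.
cr=$(printf '\r')                 # literal CR, works with any sed flavour
sed "s/${cr}*\$//" sample.txt > run_is_one_eol.txt

# Compare the resulting line counts.
wc -l each_cr_is_eol.txt run_is_one_eol.txt
```

The first reading triples the line count (6 lines here), the second keeps it (2 lines), so which one is right depends on what the broken tool meant to write.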
>
>> Are the docs missing anything that would have helped clarify the issue earlier?
> A brief note on the limitations of renormalization might have proven helpful
I'll maybe add that to my list of todos (though it's a bit long and
aspirational ;-)

>  - in particular, the bit that tripped me up was the requirement to remove and restore the files from the Git repository itself.
I think it's just a checkout and then an `add` of the renormalised files
with `git add --renormalize .` (not forgetting the all-important `dot`),
though some may have described the checkout as the files being 'removed'.

I did notice (when cross-checking a few points) that there is also a
`merge.renormalize` config option that makes sure that, when branches
are merged, you get the required re-normalisation (check the man
pages).
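
For the archives, a minimal sketch of that sequence in a throwaway repository. The file names and the identity values passed with `-c` are placeholders of mine, not anything from the original report, and it uses a plain \r\n file rather than the odd \r\r\n one.

```shell
# Throwaway repository; file names and identity values are illustrative.
repo=$(mktemp -d) && cd "$repo"
git init -q
printf 'one\r\ntwo\r\n' > crlf.txt
git add crlf.txt
git -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "Add CRLF file"

# Declare the file as text, then re-stage tracked files so that the
# \r\n -> \n conversion is applied to what is stored in the index.
printf '*.txt text\n' > .gitattributes
git add .gitattributes
git add --renormalize .           # note the trailing dot
git status --short
git -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "Renormalize"

# Ask merges to renormalize as well (see merge.renormalize in git-config(1)).
git config merge.renormalize true
```

After the second commit the stored blob should have plain \n endings, which `git show HEAD:crlf.txt | od -c` lets you check.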

>  It wasn't obvious to me that this would have any impact on renormalization. Additionally, a note about the restriction on converting only \r\n to \n might also have proven useful.

OK.
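
Something along these lines might illustrate the restriction. It only emulates the "one \r\n becomes \n" rule with `sed`; it is not Git's conversion code, and the file names are invented. Each pass removes a single \r in front of the \n, so \r\r\n needs two passes.

```shell
# Emulation only: strip exactly one \r before each \n, as a stand-in for
# the \r\n -> \n rule. File names are hypothetical.
work=$(mktemp -d) && cd "$work"
cr=$(printf '\r')                 # literal CR, works with any sed flavour

printf 'Test\r\r\nFinal Line\r\r\n' > state0.txt   # \r\r\n endings
sed "s/${cr}\$//" state0.txt > state1.txt          # first pass:  \r\n endings
sed "s/${cr}\$//" state1.txt > state2.txt          # second pass: \n endings

# Count the remaining \r bytes after each pass.
for f in state0.txt state1.txt state2.txt; do
    printf '%s: %s CR bytes\n' "$f" "$(tr -cd '\r' < "$f" | wc -c)"
done
```

That matches the three states in the report: the first renormalize sees the \r\r\n file on disk, and only after the \r\n version is checked out does a second renormalize have anything left to convert.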

PS, in-line replies preferred on the list.
>
> Thanks,
> Bevan
>
>
> -----Original Message-----
> From: Philip Oakley <philipoakley@iee.email>
> Sent: 31 May 2022 22:12
> To: Philip, Bevan <Bevan.Philip@xxxxxxxxxxxxxx>; git@xxxxxxxxxxxxxxx
> Subject: Re: Files with \r\n\n line endings can result in needing to renormalize twice, after deleting checked out file and restoring from repo
>
> On 31/05/2022 15:24, Philip, Bevan wrote:
>> Hello all,
>>
>> I've experienced an odd bug/limitation with `git add --renormalize`, requiring me to run the command twice on a specific file. Here is a bug report.
>>
>> What did you do before the bug happened? (Steps to reproduce your
>> issue)
>>
>> #!/bin/bash -x
>> printf "Test\\r\\r\\nTest Another Line\\r\\r\\nFinal Line\\r\\r\\n\\r\\r\\n" > git.bdf
>> printf "* text=auto\\n*.bdf text" > .gitattributes
>> mkdir test1
>> cd test1
>> git init
>> cp ../git.bdf .
>> git add .
>> git status
>> git commit -m "Add file git.bdf"
>> cp ../.gitattributes .
>> git add .gitattributes
>> git add --renormalize .
>> git status
>> git commit -m "Renormalize git.bdf"
>> git add --renormalize .
>> git status
>> rm git.bdf
>> git restore .
>> git add --renormalize .
>> git status
>>
>> What did you expect to happen? (Expected behavior)
>> Only needing to renormalize the file once.
> That sounds like an obvious expectation, ...
>> What happened instead? (Actual behavior)
>> Renormalize the file once, then renormalize again after deleting the file that is checked out on disk and restoring it from the object stored within the Git repo.
>>
>> What's different between what you expected and what actually happened?
>> Needed to run the renormalize step again, after deleting the file checked out on disk and restoring the file from the object stored within the Git repo.
>>
>> Anything else you want to add:
>> This only occurs for files with \r\r\n line endings (and possibly also
>> ending the file with \r\r\n\r\n)
> ... however, if I remember the design discussion correctly, normalisation was decided to be just the conversion of the Windows style EOL = `\r\n` to the Linux/*nix style EOL =`\n`, and any other characters
> (utf8 / ascii bytes) were to be unchanged, including random '\r'
> characters. So in that respect I think it is working as initially designed.
>
>> The file is in three states:
>> - Initial state: \r\r\n line endings within Git object
>> - Initial renormalization state: \r\n line endings within Git object
>> - Second renormalization state: \n line endings within Git object
>>
>> Happens on both Windows and Linux (replicated on a fresh install of Git for Windows within Windows Sandbox). Additionally, tested with `next` trunk on Linux.
>> System info is for a Windows build where it does happen.
>>
>> Directory, and file names should be irrelevant.
>>
>> We encountered this naturally, with some files within a SVN repo we're migrating.
> Do you have any information on how the mixed EOL styles (extra \r etc) came about?
> Should those extra \r characters also be separate EOLs? (and how to
> decide..?)
> Are the docs missing anything that would have helped clarify the issue earlier?
>> [System Info]
>> git version:
>> git version 2.36.1.windows.1
>> cpu: x86_64
>> built from commit: e2ff68a2d1426758c78d023f863bfa1e03cbc768
>> sizeof-long: 4
>> sizeof-size_t: 8
>> shell-path: /bin/sh
>> feature: fsmonitor--daemon
>> uname: Windows 10.0 19043
>> compiler info: gnuc: 11.3
>> libc info: no libc information available
>> $SHELL (typically, interactive shell): <unset>
>>
>>
> --
> Philip



