On 03/06/2022 14:14, Philip Oakley wrote:
> On 01/06/2022 11:07, Philip, Bevan wrote:
>> Hey Philip,
>>
>> Thanks for the response!
>>
>>> ... however, if I remember the design discussion correctly,
>>> normalisation was decided to be just the conversion of the Windows
>>> style EOL = `\r\n` to the Linux/*nix style EOL = `\n`, and any other
>>> characters (utf8 / ascii bytes) were to be unchanged, including
>>> random '\r' characters. So in that respect I think it is working as
>>> initially designed.
>> This makes sense.
>>
>>> Do you have any information on how the mixed EOL styles (extra \r
>>> etc) came about?
>> I wish I knew how this file came about, but the people that put these
>> files in our VCS have long left. I suspect some broken generation tool.
> I vaguely remember tales that early Macs used \r as their EOL character,
> so it may have been that.
>>> Should those extra \r characters also be separate EOLs? (and how to
>>> decide..?)
>> Most tooling I use seems to do this, but I agree that this is an
>> ambiguous topic.
> maybe an extra `sed` invocation changing all the \r to \n in such cases!

It looks like StackOverflow has an answer
https://stackoverflow.com/a/42914886/717355

$ sed -i 's/\r/\n/g; s/\n$//'

for the all-at-once conversion filter using sed (with explanation!). I
believe it's idempotent (great word to know ;-)

>>> Are the docs missing anything that would have helped clarify the
>>> issue earlier?
>> A brief note on the limitations of renormalization might have proven
>> helpful
> I'll maybe add that to my list of todo's (though it's a bit long and
> aspirational;-)
>
>> - in particular, the bit that tripped me up was the requirement to
>> remove and restore the files from the Git repository itself.
> I think it's just a checkout and then an `add` of the renormalised files
> `git add --renormalize . ` (not forgetting the all important `dot`),
> though some may have termed the checkout as the files being 'removed'.
>
> I did notice (when cross checking a few points) that there is also a
> `merge.renormalize` config option that will then make sure that when
> branches are merged you get the required re-normalisation (check the man
> pages ..).
>
>> It wasn't obvious to me that this would have any impact on
>> renormalization. Additionally, a note about the restriction on
>> converting only \r\n to \n might also have proven useful.
> OK.
>
> PS, in-line replies preferred on the list.
>> Thanks,
>> Bevan
>>
>>
>> -----Original Message-----
>> From: Philip Oakley <philipoakley@iee.email>
>> Sent: 31 May 2022 22:12
>> To: Philip, Bevan <Bevan.Philip@xxxxxxxxxxxxxx>; git@xxxxxxxxxxxxxxx
>> Subject: Re: Files with \r\n\n line endings can result in needing to renormalize twice, after deleting checked out file and restoring from repo
>>
>> On 31/05/2022 15:24, Philip, Bevan wrote:
>>> Hello all,
>>>
>>> I've experienced an odd bug/limitation with `git add --renormalize`,
>>> requiring me to run the command twice on a specific file. Here is a
>>> bug report.
>>>
>>> What did you do before the bug happened? (Steps to reproduce your
>>> issue)
>>>
>>> #!/bin/bash -x
>>> printf "Test\\r\\r\\nTest Another Line\\r\\r\\nFinal Line\\r\\r\\n\\r\\r\\n" > git.bdf
>>> printf "* text=auto\\n*.bdf text" > .gitattributes
>>> mkdir test1
>>> cd test1
>>> git init
>>> cp ../git.bdf .
>>> git add .
>>> git status
>>> git commit -m "Add file git.bdf"
>>> cp ../.gitattributes .
>>> git add .gitattributes
>>> git add --renormalize .
>>> git status
>>> git commit -m "Renormalize git.bdf"
>>> git add --renormalize .
>>> git status
>>> rm git.bdf
>>> git restore .
>>> git add --renormalize .
>>> git status
>>>
>>> What did you expect to happen? (Expected behavior) Only needing to
>>> renormalize the file once.
>> That sounds like an obvious expectation, ...
>>> What happened instead?
>>> (Actual behavior) Renormalize the file once, then renormalize again
>>> after deleting the file that is checked out on disk and restoring it
>>> from the object stored within the Git repo.
>>>
>>> What's different between what you expected and what actually happened?
>>> Needed to run the renormalize step again, after deleting the file
>>> checked out on disk and restoring the file from the object stored
>>> within the Git repo.
>>>
>>> Anything else you want to add:
>>> This only occurs for files with \r\r\n line endings (and possibly also
>>> ending the file with \r\r\n\r\n)
>> ... however, if I remember the design discussion correctly,
>> normalisation was decided to be just the conversion of the Windows
>> style EOL = `\r\n` to the Linux/*nix style EOL =`\n`, and any other
>> characters (utf8 / ascii bytes) were to be unchanged, including random
>> '\r' characters. So in that respect I think it is working as initially
>> designed.
>>
>>> The file is in three states:
>>> - Initial state: \r\r\n line endings within Git object
>>> - Initial renormalization state: \r\n line endings within Git object
>>> - Second renormalization state: \n line endings within Git object
>>>
>>> Happens on both Windows and Linux (replicated on a fresh install of
>>> Git for Windows within Windows Sandbox). Additionally, tested with
>>> `next` trunk on Linux.
>>> System info is for a Windows build where it does happen.
>>>
>>> Directory, and file names should be irrelevant.
>>>
>>> We encountered this naturally, with some files within a SVN repo
>>> we're migrating.
>> Do you have any information on how the mixed EOL styles (extra \r etc)
>> came about?
>> Should those extra \r characters also be separate EOLs? (and how to
>> decide..?)
>> Are the docs missing anything that would have helped clarify the issue
>> earlier?
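
The three states listed in the report (\r\r\n, then \r\n, then \n) can be
reproduced without Git at all, which may make the "working as designed"
point easier to see. What follows is only a rough sketch: it models
normalisation as a plain "\r\n to \n" filter, and the helper names
(`crlf_to_lf`, `strip_trailing_crs`, `count_cr`, `sample`) are made up for
illustration and are not Git commands or Git internals. The second filter
shows one possible "all trailing \r belong to the EOL" policy, which is an
assumption and differs from the StackOverflow `sed` approach of turning
every \r into its own EOL.

```shell
#!/bin/sh
# Illustrative model only: hypothetical helpers, not Git commands. They just
# mimic the "\r\n -> \n" rule discussed in this thread.

# One normalisation pass: drop a single CR sitting directly before the LF.
crlf_to_lf() {
    awk '{ sub(/\r$/, ""); print }'
}

# Alternative policy (an assumption, not Git behaviour): drop every CR that
# trails a line, so "\r\r\n" becomes "\n" in a single pass.
strip_trailing_crs() {
    awk '{ sub(/\r+$/, ""); print }'
}

# Count the CR bytes arriving on stdin.
count_cr() {
    tr -cd '\r' | wc -c | tr -d ' '
}

# Two lines with the problematic "\r\r\n" endings.
sample() {
    printf 'Test\r\r\nFinal Line\r\r\n'
}

echo "original:     $(sample | count_cr) CR bytes"
echo "one pass:     $(sample | crlf_to_lf | count_cr) CR bytes"
echo "two passes:   $(sample | crlf_to_lf | crlf_to_lf | count_cr) CR bytes"
echo "strip at once: $(sample | strip_trailing_crs | count_cr) CR bytes"
```

Run under a POSIX shell, the counts should go 4, 2, 0 and 0: a single
"\r\n to \n" pass is not idempotent on \r\r\n input, so a second pass still
finds something to change, whereas the strip-everything variant settles in
one pass.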