Re: Suggested clarification for .gitattributes reference documentation

> To my knowledge the binary iconv.exe (or just iconv under non-Windows) is never called from Git itself.

> I can't find a single instance of Git for Windows calling iconv.exe instead of using the corresponding library functions.

Thank you for your responses.  I think you are both right.  Git must instead call functions in libiconv-2.dll to do encoding conversions.

I have no idea why my Windows 10 PC could add a UTF-16LE with BOM file but later fail to "decode" it, when running Git from an ordinary Command Prompt (cmd.exe).  I assume this failure was a fluke, since I cannot reproduce it on my other (Windows 11) PC.  So I am withdrawing my concerns about:

1) Git for Windows failing to support UTF-16LE with BOM.
2) Git for Windows installer being misleading in its "recommended" PATH modification option.

As for documentation clarifications for the .gitattributes manpage at https://git-scm.com/docs/gitattributes, I still suggest adding an explicit example for UTF-16LE with BOM, and/or adding a table listing which working-tree-encoding value to use for each of the following UTF-16 text encodings:

ENCODING              'working-tree-encoding' VALUE
-------------------   -----------------------------
UTF-16LE with BOM     UTF-16LE-BOM
UTF-16BE with BOM     UTF-16
UTF-16LE no BOM       UTF-16LE
UTF-16BE no BOM       UTF-16BE
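
To make the table easier to apply, here is a minimal Python sketch that inspects a file's leading bytes and suggests the matching value.  It only encodes the mapping in the table above; the function name suggest_working_tree_encoding is hypothetical and not part of Git.  It assumes that a BOM, when present, is the very first thing in the file, and it deliberately returns None for BOM-less input, since UTF-16LE and UTF-16BE without a BOM cannot be told apart reliably from a short prefix.

```python
import codecs
from typing import Optional


def suggest_working_tree_encoding(head: bytes) -> Optional[str]:
    """Map the leading bytes of a file to a value from the table above.

    Hypothetical helper for illustration only; it is not part of Git.
    Returns None when no UTF-16 BOM is found.
    """
    # The UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM bytes,
    # so rule it out first. Assumption: a UTF-16LE file whose first
    # character is U+0000 is not a case worth handling here.
    if head.startswith(codecs.BOM_UTF32_LE):
        return None
    if head.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "UTF-16LE-BOM"
    if head.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "UTF-16"
    return None


# Example: the bytes Notepad would write for "hello" saved as UTF-16 LE.
sample = codecs.BOM_UTF16_LE + "hello".encode("utf-16-le")
print(suggest_working_tree_encoding(sample))  # UTF-16LE-BOM
```

The returned string would then go into a .gitattributes entry of the form "<pattern> text working-tree-encoding=<value>", where the pattern is whatever matches the files in question.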

Why bother clarifying the documentation?  Because these UTF-16 encodings are commonly found on Windows systems.  Notepad supports the first two, and many Visual Studio project wizards add various files using these encodings as well.  Older versions of PowerShell saved new .ps1 scripts using UTF-16BE with BOM as the default encoding.

Also, the current .gitattributes documentation makes frequent reference to "UTF-16" as an encoding but fails to be clear that the working-tree-encoding value "UTF-16" is now only for UTF-16BE with BOM.  It would be easy to assume that the working-tree-encoding value "UTF-16" meant any UTF-16 file with a BOM (either LE or BE), which was the original meaning of this value before UTF-16LE-BOM was added to Git.

Finally, I am not sure how to use git add --renormalize to correct a UTF-16 file that was previously added incorrectly (i.e. with a missing or incorrect working-tree-encoding entry in .gitattributes).  The git add documentation at https://git-scm.com/docs/git-add implies 'renormalize' resets only the end-of-line values; however, I suspect it also re-converts text encoding when a working-tree-encoding property is set.  It would be helpful to know one way or the other.

- Michael Litwak

-----Original Message-----
From: Torsten Bögershausen <tboegi@xxxxxx> 
Sent: Friday, January 12, 2024 11:43 PM
To: Michael Litwak <michael.litwak@xxxxxxxx>
Cc: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>; git@xxxxxxxxxxxxxxx
Subject: [EXTERNAL]Re: Suggested clarification for .gitattributes reference documentation

On Sat, Jan 13, 2024 at 02:56:27AM +0000, Michael Litwak wrote:
> I just installed Git for Windows 2.43.0 and noticed the installer offers three options for altering the PATH:
>
> 1) Run git from git bash only
>
> 2) Run git from git bash, cmd.exe and PowerShell (RECOMMENDED)
>
> 3) Run git from git bash, cmd.exe and PowerShell with optional utilities (warning: will override find, sort and other system utilities).
>
> It turns out iconv.exe is accessible from cmd.exe (Command Prompt) only when you take the third option.  But iconv.exe is NOT optional.  It is required for git to deal with UTF-16LE with BOM text conversions (and probably for numerous other encoding conversions).

Please wait a second - and thanks for bringing this up.
To my knowledge the binary iconv.exe (or just iconv under non-Windows) is never called from Git itself.
Git is using iconv_open() and friends, which are all inside a library, either the C-library "libc", or "libiconv"
(not 100% sure about the naming here)

iconv.exe is not needed in everyday use, or is it?
If it is, when?
iconv.exe is used when you run the test suite, to verify what Git is doing.

Could you please elaborate a little more:
when is iconv.exe missing, and what happens then?

>
> But when PATH option #2 is chosen, and iconv.exe is unreachable from a Windows Command Prompt, the git commands which call upon iconv.exe do NOT indicate the error.  The call to iconv.exe fails silently.  It is only later after you commit, push and clone the repo again that you see the encoding failures.
>
> And the warning about overriding find and sort must be taken with a grain of salt, since the Windows versions of those programs are accessed via a Windows folder which appears earlier in the PATH.
>
> So this Git for Windows installer screen is misleading.  And perhaps iconv.exe should be relocated so it is accessible even when PATH option #2 is chosen.  I intend to submit an issue on the Git for Windows issue tracker regarding this.  I'll also submit an issue about the lack of an error when running 'git add' for a UTF-16LE with BOM file under PATH option #2.
>
> Thanks,
> - Michael
>