RE: git clone corrupts file.

OK, thanks for all the help.
I think that with the path in .gitattributes it will be fine.

dir/sub/path/*.ini text eol=crlf working-tree-encoding=UTF-16LE-BOM

I will give those a try and see how it works out.  And thanks especially for the advice on "git add --renormalize".  I would never have thought to do that.


Thanks, 

Scott Russell
Staff SW Engineer 
NCR Corporation 
Phone: +17706237512
Scott.Russell2@xxxxxxx  |  ncr.com
       

-----Original Message-----
From: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> 
Sent: Monday, August 16, 2021 6:20 PM
To: Russell, Scott <Scott.Russell2@xxxxxxx>
Cc: Jeff King <peff@xxxxxxxx>; git@xxxxxxxxxxxxxxx
Subject: Re: git clone corrupts file.

On 2021-08-16 at 22:04:20, Russell, Scott wrote:
> Thanks Brian,
> 
> I appreciate the guidance.   All our .h files can be converted to ANSI.   I don't know why we seem to have had just one saved as Unicode.
> But it was a wakeup, and led to discovery of other files not correct.
> 
> Upon reading the help on .gitattributes, I was reminded that Windows Visual Studio can save some .rc files as Unicode.
> I think that almost all are ANSI, but that leaves the possibility that any one saved as Unicode could unexpectedly fail to compile due to the conversion.

I do want to draw a distinction here.  You're referring to "Unicode" and "ANSI", which traditionally mean, on Windows, little-endian UTF-16 with BOM and Windows-1252.  You do not generally want Windows-1252, or the encoding on which it's based, ISO-8859-1.  Those are obsolete and have been for well over a decade.  It's unfortunate that many Windows programs continue to use these terms, because neither "Unicode" nor "ANSI" describes an actual character set according to IANA.

What is going to work best here is UTF-8 without a BOM.  Most Windows programs can handle that these days, but some still don't.  If you try to save things as "ANSI" without a working-tree-encoding and they aren't completely ASCII files, then you will end up with some weird diff output at the very least.
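
As an illustration of moving a file to UTF-8 without a BOM, here is a minimal sketch using iconv.  The sample file and the temporary directory are made up for the demonstration, and it assumes an iconv that uses the BOM to pick the byte order when the source encoding is given as plain "UTF-16" (glibc and GNU libiconv behave this way):

```shell
work=$(mktemp -d)

# Hypothetical sample: "a=1" plus CRLF, encoded as UTF-16LE with a BOM (FF FE).
printf '\377\376a\000=\0001\000\r\000\n\000' > "$work/utf16.ini"

# Convert to UTF-8; the BOM is consumed while decoding and none is written out.
iconv -f UTF-16 -t UTF-8 "$work/utf16.ini" > "$work/utf8.ini"

# Dump the result byte by byte to confirm there is no EF BB BF prefix.
od -An -c "$work/utf8.ini"
```

Once a file has been converted this way and committed, it needs no working-tree-encoding attribute at all.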

If the files are completely ASCII, then no working-tree-encoding is necessary, because ASCII is a subset of UTF-8.
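
One rough way to find out which files are completely ASCII is to ask iconv to decode them as ASCII and see whether it fails.  The helper name is_ascii and the two demo files below are hypothetical, and this is only a sketch, not a full encoding detector:

```shell
# is_ascii FILE: succeed only if every byte of FILE decodes as 7-bit ASCII.
# iconv exits non-zero when it meets a byte outside that range.
is_ascii() {
    iconv -f ASCII -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical demo files in a temporary directory.
demo_dir=$(mktemp -d)
printf 'key=value\r\n' > "$demo_dir/plain.ini"      # pure ASCII
printf 'name=caf\351\r\n' > "$demo_dir/latin1.ini"  # 0xE9 is e-acute in Windows-1252

for f in "$demo_dir"/*.ini; do
    if is_ascii "$f"; then
        echo "$(basename "$f"): ASCII only"
    else
        echo "$(basename "$f"): contains non-ASCII bytes"
    fi
done
```

A UTF-16 file with a BOM fails this check because of the FF FE bytes.  A BOM-less UTF-16 file holding only ASCII characters would pass, since NUL bytes are valid ASCII, so the check is a first filter rather than proof.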

> We have a mix of *.ini files: most are ANSI, but more than a few are Unicode.
> I don't know how to handle a mixture.
> 
> Perhaps I will have to specify
> 
> *.ini -text.
> 
> Unless, does .gitattributes allow paths to be specified?  In effect, use the
> 
> Path/path/path/*  text lf=crlf working-tree-encoding=UTF-16LE-BOM

Yes, this syntax is allowed.  See the gitattributes(5) manual page for what's allowed.  You can even do this:

dir/sub/path/*.ini text eol=crlf working-tree-encoding=UTF-16LE-BOM
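
For a mixture of encodings, one possible layout is a general rule for all .ini files followed by a more specific rule for the directory that holds the UTF-16 ones.  The sketch below tries this in a throwaway repository and uses "git check-attr" to show what Git would apply.  The general "*.ini" rule and the path other/app.ini are assumptions made for the example, not something taken from your repository:

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# When several patterns match, a later line overrides an earlier one,
# attribute by attribute.
cat > .gitattributes <<'EOF'
*.ini              text eol=crlf
dir/sub/path/*.ini text eol=crlf working-tree-encoding=UTF-16LE-BOM
EOF

# check-attr only consults the attribute rules; the files need not exist.
git check-attr text eol working-tree-encoding -- other/app.ini dir/sub/path/app.ini
```

Checking a handful of real paths this way before committing the .gitattributes file is a cheap way to catch a pattern that does not match what you expected.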

One thing I forgot to mention is that after modifying your .gitattributes file, you'll want to run "git add --renormalize ." and then commit both the .gitattributes file and any changes.  Otherwise, you may end up with files that don't end up converted the way that you want.
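
A minimal sketch of that sequence, run in a throwaway repository.  The file name settings.ini, the commit messages, and the identity settings are placeholders chosen for the example:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.autocrlf false

# Commit a CRLF file before any attributes exist; it is stored as-is.
printf 'a=1\r\nb=2\r\n' > settings.ini
git add settings.ini
git commit -q -m "Add settings.ini"

# Declare the file as text, then re-apply the conversion to tracked files.
printf '*.ini text eol=crlf\n' > .gitattributes
git add --renormalize .

# --renormalize only touches tracked files, so add .gitattributes explicitly.
git add .gitattributes
git commit -q -m "Add .gitattributes and renormalize"

# The committed blob should now use LF only; count any remaining CR bytes.
git cat-file -p HEAD:settings.ini | tr -cd '\r' | wc -c
```

The working-tree copy keeps its CRLF line endings because of eol=crlf; only the version stored in the repository changes.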
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA



