Re: what file extensions must be explicitly configured with respect to eol-type in gitattributes?

On Mon, Sep 20, 2010 at 6:19 PM, Robert Buck <buck.robert.j@xxxxxxxxx> wrote:
> Hello,
>
> One project we have in house has approximately 160 different file
> extensions used for the files checked in. In our repository there are
> files that MUST be CRLF (.bat, .cmd, .vcproj, etc), files that MUST be
> LF (.xml, .xsl, .sh, etc), and files that MUST be binary. All others
> are just text and so long as they appear in native form I'd be happy.
>
> It would seem a default rule to handle text files would make sense:
>
> * text=auto
>
> But I have not found material explaining how git identifies binary
> files, so one concern would be that it could mangle binary file types
> in some cases.
>
> Do I have to explicitly mention all 160 file types in the gitattributes file?

If you want to be 100% sure, yes.
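
As a rough sketch of what that could look like, using the extensions you
listed plus two hypothetical binary ones (*.png and *.jar are my own
examples, not taken from your list): put the catch-all first and the
specific rules after it, since a later matching line overrides an
earlier one.

```gitattributes
# Fallback: let git guess, and normalize whatever it considers text
*          text=auto

# Must be CRLF in the working tree
*.bat      text eol=crlf
*.cmd      text eol=crlf
*.vcproj   text eol=crlf

# Must be LF in the working tree
*.xml      text eol=lf
*.xsl      text eol=lf
*.sh       text eol=lf

# Never converted (hypothetical extensions, adjust to your project)
*.png      binary
*.jar      binary
```

Any extension you leave out falls through to the text=auto guess.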

> How does git internally determine whether a file is text vs binary?
> Does it use the 'file' command in Unix?

No. Git applies the following heuristics to the file content:
- Does the file contain any NUL-characters? If so, it's binary.
- Is the ratio of printable to non-printable characters (when
interpreted as ASCII) below 128? If so, it's binary.
- Otherwise, it's text.
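
To make the heuristic concrete, here is a small Python sketch of it. This
is an illustration and not git's code: the name looks_binary is made up,
and the handling of individual control characters (which ones count as
printable, and skipping CR/LF) is my reading of convert.c, so treat it as
an assumption and check the source if the details matter.

```python
def looks_binary(data: bytes) -> bool:
    """Approximate the text-vs-binary guess described above."""
    # Rule 1: any NUL byte means binary.
    if b"\0" in data:
        return True

    printable = 0
    nonprintable = 0
    for byte in data:
        if byte in (0x0A, 0x0D):
            # Assumption: LF and CR are line-ending bytes and are not
            # counted as either printable or non-printable.
            continue
        if byte in (0x08, 0x09, 0x0C, 0x1B):
            # Assumption: BS, TAB, FF and ESC are treated as printable.
            printable += 1
        elif byte < 32 or byte == 127:
            # Remaining control characters and DEL.
            nonprintable += 1
        else:
            # Everything else, including bytes >= 128.
            printable += 1

    # Rule 2: binary if there is more than one non-printable byte
    # per 128 printable ones.
    return (printable >> 7) < nonprintable


if __name__ == "__main__":
    for sample in (b"hello world\n", b"abc\0def", b"a" * 127 + b"\x01"):
        print(len(sample), looks_binary(sample))
```

Note that bytes above 127 count as printable here, so UTF-8 or Latin-1
text without control characters would still be guessed as text.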

You can find the exact function here (beware of wrapping):
http://git.kernel.org/?p=git/git.git;a=blob;f=convert.c;h=01de9a84c21b31a0120065a32a386f27321cdf7b;hb=HEAD#l77

In general, this works pretty well. In addition, there's the
core.safecrlf configuration variable, which can be used to protect you
against normalizing the file in such a way that the exact original
file can't be recovered.

> And where I am going with this specifically is a question: what rules
> MUST be specifically stated in gitattributes and what rules are there
> implicitly?

Given the above information, you should be able to figure this one out
for yourself. The answer depends on how pedantic you are ;)