Re: mingw, windows, crlf/lf, and git

Johannes Schindelin wrote:
> Last time I checked, the text files never had lines longer than 200 characters (I chose this intentionally large). So, it might be a good heuristic to check the maximal line length, and refuse to believe that it's text once a certain (configurable) threshold is reached.
>
> Ciao,
> Dsch

Unfortunately, on my program we have folks using text files with single lines over 60,000 characters long. These are data files: think, for example, of a comma- or tab-separated data file saved from a spreadsheet. In this case the files are pure ASCII. So line length could be something else to take into account, but it is not decisive by itself.

To recap, we have the following various suggestions to determine textness:

1) ratio of ASCII to non-ASCII characters, possibly weighting some chars more than others
2) line length
3) existence of a NUL byte (\0)
4) file name globbing
5) roundtrip: lf(crlf(file)) == file
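
To make the five suggestions concrete, here is a rough Python sketch of each as a standalone function. None of this is git code. All function names are made up for illustration, and the roundtrip check assumes crlf() only converts LFs that are not already preceded by a CR (with a naive crlf() that converts every LF, the roundtrip would always succeed and tell us nothing).

```python
import fnmatch
import re

def ascii_ratio(data: bytes) -> float:
    """1) Fraction of bytes that are printable ASCII, tab, LF or CR."""
    if not data:
        return 1.0
    ok = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return ok / len(data)

def max_line_length(data: bytes) -> int:
    """2) Length in bytes of the longest LF-delimited line."""
    return max(len(line) for line in data.split(b"\n"))

def has_nul(data: bytes) -> bool:
    """3) True if the content contains a NUL byte."""
    return b"\0" in data

def name_matches(filename: str, patterns) -> bool:
    """4) True if the file name matches any of the glob patterns."""
    return any(fnmatch.fnmatch(filename, p) for p in patterns)

def crlf(data: bytes) -> bytes:
    # Assumption: only bare LFs are converted; existing CRLFs are left alone.
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

def lf(data: bytes) -> bytes:
    return data.replace(b"\r\n", b"\n")

def roundtrips(data: bytes) -> bool:
    """5) True if converting to CRLF and back reproduces the content."""
    return lf(crlf(data)) == data
```

Under that assumption, roundtrips() fails for any content that already contains a CRLF, which is exactly the case where conversion would silently change the file.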

I don't think any single suggestion is adequate for all uses; all of them need to be available and configurable. This suggests to me a core.AutoCRLFstrategy variable holding a comma-separated list of methods to use, with a reasonable default that does not cause runtime headaches on Unix. A file would be deemed binary unless all listed methods declare it to be text, and an empty list would disable AutoCRLF detection.
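
As a sketch of how such a comma-separated strategy list might be evaluated (Python, purely illustrative): the method names "nul" and "linelength", the CHECKS table, and the 200-character limit borrowed from the quoted message are all assumptions, not anything git implements.

```python
MAX_LINE = 200  # hypothetical threshold; would itself be configurable

# Each check returns True when it considers the content to be text.
CHECKS = {
    "nul": lambda data: b"\0" not in data,
    "linelength": lambda data: max(len(l) for l in data.split(b"\n")) <= MAX_LINE,
}

def parse_strategy(value):
    """Split a comma-separated strategy string into method names."""
    return [name.strip() for name in value.split(",") if name.strip()]

def is_text(data, strategy):
    """Text only if every listed method agrees; an empty list disables detection."""
    methods = parse_strategy(strategy)
    if not methods:
        return False  # detection disabled: treat as binary, so no conversion
    return all(CHECKS[name](data) for name in methods)
```

With this shape, a site with very long data lines could set the strategy to "nul" alone, while others keep "nul,linelength". An unknown method name raises KeyError here; a real implementation would presumably report a configuration error instead.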

Mark
