Re: Git EOL Normalization

Dmitry Potapov <dpotapov@xxxxxxxxx> · Wed, 25 May 2011 21:58:33 +0400

On Wed, May 25, 2011 at 7:20 PM, Stephen Bash <bash@xxxxxxxxxxx> wrote:
>
> The open questions for me are:
>  1) what is the actual text file detection algorithm?
>  2) what is the autocrlf LF/CRLF detection algorithm?
>  3) how does autocrlf handle mixed line endings? (either in the working copy or repo)

Git looks at the text attribute of a file. If the attribute is set, git
treats the file as text; if it is unset, git treats the file as binary.
If the text attribute is 'auto', or it is unspecified but core.autocrlf
is true, then git uses heuristics to detect text files.
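
To make the decision order concrete, here is a rough Python sketch. The
function name conversion_mode and its string results are made up for
illustration, and the last branch (attribute unspecified, core.autocrlf
not true, so the file is left alone) is my assumption rather than
something stated above.

```python
from typing import Optional


def conversion_mode(text_attr: Optional[str], autocrlf_true: bool) -> str:
    """Hypothetical helper, not git's code.

    text_attr is 'set', 'unset', 'auto', or None when unspecified.
    """
    if text_attr == "set":
        return "text"      # always treated as text
    if text_attr == "unset":
        return "binary"    # never converted
    if text_attr == "auto" or (text_attr is None and autocrlf_true):
        return "guess"     # fall back to the content heuristics
    return "untouched"     # assumption: no attribute, autocrlf not true
```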

Currently, the following heuristics are used:

A file is considered text if it contains no '\0' and no bare CR, and
fewer than 1 in 128 of its characters are non-printable.

Non-printable characters are DEL (127) and anything less than 32 except
CR, LF, BS, HT, ESC and FF.
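
As a rough Python sketch of these heuristics (not git's actual code):
the name looks_like_text is made up, and I am assuming the 1-in-128 rule
compares the non-printable count against the printable count divided by
128, with LF and CRLF line endings counted as neither.

```python
def looks_like_text(data: bytes) -> bool:
    """Hypothetical sketch of the text-detection heuristic."""
    # Control bytes that still count as printable: BS, HT, ESC, FF.
    allowed_controls = {0x08, 0x09, 0x1B, 0x0C}
    printable = 0
    nonprintable = 0
    i = 0
    n = len(data)
    while i < n:
        c = data[i]
        if c == 0:
            return False  # any NUL byte means binary
        if c == 0x0D:
            if i + 1 < n and data[i + 1] == 0x0A:
                i += 2    # CRLF pair is a line ending, skip both bytes
                continue
            return False  # bare CR means binary
        if c == 0x0A:
            pass          # LF is a line ending, counted as neither
        elif c == 127 or (c < 32 and c not in allowed_controls):
            nonprintable += 1
        else:
            printable += 1
        i += 1
    # Assumed reading of the ratio: at most one non-printable
    # byte per 128 printable bytes.
    return nonprintable <= printable // 128
```

With this reading, b"a\r\nb\r\n" counts as text, while b"a\rb" (bare CR)
and b"a\0b" (NUL) do not.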

Also, to avoid problems with autocrlf=true when someone has already
committed a text file with CRLF line endings, CRLF->LF conversion
happens only if the tracked version of the file in the index does not
contain any CR.
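
A rough Python sketch of that safety check, for a file that has already
been classified as text. The name normalize_for_index and the use of
None for a file with no version in the index yet are made up for
illustration; this is not git's code.

```python
from typing import Optional


def normalize_for_index(worktree: bytes, indexed: Optional[bytes]) -> bytes:
    """Hypothetical sketch: convert CRLF to LF unless the index has CRs.

    indexed is the content currently tracked in the index, or None
    if the file is not tracked yet.
    """
    if indexed is not None and b"\r" in indexed:
        # The tracked version already contains CR, so leave the
        # content alone rather than rewrite every line.
        return worktree
    return worktree.replace(b"\r\n", b"\n")
```

So a new file with CRLF endings is stored with LF, while a file whose
tracked version already contains CR is stored unchanged.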

Dmitry
PS I wrote this mostly from memory, so I may have missed some details.