Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 7, 2010 at 7:18 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> Maybe we should rethink this from the top.  Imagine that we currently
>> have no crlf options whatsoever.  What *should* it look like?  I
>> suggest the following:
>>
>> Config:
>>    core.eolOverride = lf / crlf / auto / binary / input
>>    core.eolDefault = lf / crlf / auto / binary / input
>
> Ugh. Hell no. What an ugly format. What does that crazy "override vs
> default" even _mean_?

That's easy:

 - if "override" is set, it overrides any attribute setting.
 - if "default" is set, we use it when there's no attribute or override setting.

We can argue about whether having two config options is strictly
necessary from a formal truth table point of view, and you'll probably
win the argument because it all makes my head spin.  My argument is
simpler: if it makes my head spin, it probably makes other people's
heads spin.  The way I described is simple enough for anyone to
understand.

> Plus the above is confused anyway. The only reason to ever support 'lf' is
> if you're a total moron of a SCM, and you save files you know are text in
> CRLF format internally. That's just f*cking stupid.

What I meant by "lf" is just what we currently mean by "crlf=false".
It's more clear for the average person to say "eol=lf" than
"crlf=false", because "crlf=false" doesn't say what you *do* want, it
only says what you *don't* want.

Clearly any repo storing some other weird line ending, then converting
it to LF, is not what we want here.

>  - disabling all "text" issues, and considering everything to be pure
>   binary. This is the "I know I'm sane and unix" option, or the "doing
>   any conversion is always wrong" option.
>
>   We'd call this "binary" or "off" or "false".

Sure, that's what I called "binary" above.

>  - if you recognize a text-file, and consider it text and different from
>   binary, at a _minimum_ it needs what we call "input". Anything else is
>   crazy-talk. We don't save the same text-file in different formats, and
>   we know that CRLF (or CR) is just a stupid format for text.
>
>   So there are zero options for the input side. If we don't do CRLF -> LF
>   conversion on input, it's worthless even _talking_ about text vs binary.

That sounds good to me.  So this was a mistake in the original
implementation of autocrlf; let's just correct it, and make all text
modes do input conversion.

Note that, in prior threads on this topic, there was some objection to
doing crlf=anything by default because it wastes CPU in the common
case that people are running on Unix and aren't doing screwy things
with line endings.  Defaulting to crlf=input would require us to waste
CPU here.  Is that ok?

>  - For output, there are exactly three choices: "do nothing" (aka just
>   "input", aka "LF"), output in native format (CRLF on Windows, LF on
>   UNIX), or "force CRLF" regardless of any defaults (and the last
>   probably doesn't make sense in practice, but is good for test-suites,
>   so that you can get CRLF output even on sane platforms.
>
> So I think the _only_ sane choices are basically
>
>        core.crlf=[off|input|on|force]

One nice thing about my suggestion is that it completely avoids the
concept of a "native CRLF format."  Because nowadays, that's just not
very useful.  On Unix sometimes I need crlf files; on Windows
sometimes I need lf files.  Yes, we can still implement that in terms
of "native" terminology, but it seems to a roundabout way of stating
what I want.

> And the above is basically what we have. Except that for historical
> reasons (ie we didn't even _have_ any attributes) it got mixed it up with
> "do we want to do this automatically", so "autocrlf=on" actually ends up
> being "yes, do automatic detection" _and_ what I'd call "core.crlf=force"
> above.

Functionally, yes, we have this already.  Your new proposal is
essentially to make crlf=auto (= unspecified) to actually always
include crlf=input behaviour, which sounds good to me, but may be
backwards incompatible in some important way.  (I wouldn't think
anybody would want the non-fixing-stuff behaviour.  But I wonder what
it would do to git-svn... maybe it could just check everything in as
if it were crlf=binary, if it doesn't already.)

My suggestion doesn't much change this functionality, but attempts to
straighten out the terminology so normal humans can understand what
will happen.  Not sure if that's worth it, given that we'll probably
have to support the old attribute names forever anyhow, and adding a
second set of words might confuse normal humans all the more.  But I
would much rather teach people to use it using my terminology than
crlf=true/false/binary terminology.  What does "crlf=binary" mean?

Have fun,

Avery
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]