Re: autocrlf=input and safecrlf (was Re: CVS import [SOLVED])

Jeff King <peff@xxxxxxxx> · Sun, 22 Feb 2009 19:08:40 -0500

On Sat, Feb 21, 2009 at 12:24:11AM +0100, Ferry Huberts (Pelagic) wrote:

>> I still think safecrlf could probably be made more useful in this case
>> to differentiate between "this will corrupt your data if you do a
>> checkout with your current config settings" and "this will corrupt your
>> data forever".  But I am not a user of either config variable, so maybe
>> there is some subtlety I'm missing.
>
> I'm a user of these options myself. I maintain several large repositories 
> that contain data that is used both on Unix and Windows platforms and that 
> have the autocrlf=input and safecrlf=true. This makes sure that everything 
> is in Unix format.

OK, so there is some value to that combination, then, I suppose. It
seems like there must be some easier and more obvious way to say "reject
all CRLFs", but I can't think of one besides setting up a hook (which
would work at commit time, not add time).
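
For what it's worth, the hook route could look roughly like the Python
sketch below. This is only an illustration, not something from an
existing setup: the helper names (has_crlf, staged_paths, staged_blob,
main) are made up, and it assumes the script is installed as
.git/hooks/pre-commit with a final "sys.exit(main())" line added.

```python
import subprocess
import sys


def has_crlf(data: bytes) -> bool:
    """Return True if the content contains at least one CRLF pair."""
    return b"\r\n" in data


def staged_paths() -> list:
    """List paths that are added, copied or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "-z", "--diff-filter=ACM"],
        capture_output=True, check=True,
    ).stdout
    # -z separates paths with NUL bytes; drop the empty trailing entry.
    return [p.decode() for p in out.split(b"\0") if p]


def staged_blob(path: str) -> bytes:
    """Read the staged (index) version of a path, not the worktree copy."""
    return subprocess.run(
        ["git", "show", ":" + path], capture_output=True, check=True
    ).stdout


def main() -> int:
    """Hypothetical hook body: refuse the commit if any staged file has CRLF."""
    bad = [p for p in staged_paths() if has_crlf(staged_blob(p))]
    for p in bad:
        print("CRLF found in staged file: " + p, file=sys.stderr)
    return 1 if bad else 0
```

Note that this blindly rejects binary files containing CRLF too, so a
real hook would probably want some way to exempt them.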

> Your remark about corrupting your data is a bit strong for my taste.  
> Corruption from one point of view, making sure that everybody handles the 
> same content from another :-)

I'm not sure you understood what I meant. What I meant is that for some
set of data, applying CRLF->LF conversion is lossy, and will permanently
destroy the ability to restore the original data. For example, arbitrary
binary data which contains both CRLF and LF will have all CRLF become
LF, but you don't know which of the resulting LFs were originally CRLFs,
and which were just LFs. The data is corrupted, there is no way to get
back the original, and this is what safecrlf is meant to protect against.
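
To make the lossy case concrete, here is a tiny Python sketch. The two
helper functions are a simplified stand-in for the conversions, not
git's actual code:

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Simplified checkin conversion: every CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")


def lf_to_crlf(data: bytes) -> bytes:
    """Simplified inverse, applied to already-normalized (LF-only) data."""
    return data.replace(b"\n", b"\r\n")


# All-CRLF content survives the round trip.
text = b"one\r\ntwo\r\n"
text_restored = lf_to_crlf(crlf_to_lf(text))

# Mixed content does not: the bare LF after "two" comes back as CRLF,
# because nothing records which LFs were originally CRLFs.
mixed = b"one\r\ntwo\nthree\r\n"
mixed_restored = lf_to_crlf(crlf_to_lf(mixed))

print(text_restored == text)    # True
print(mixed_restored == mixed)  # False
```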

However, that safecrlf check is implemented by saying "with the current
autocrlf settings, would checkin and checkout get the same file?". In
the case of autocrlf=true, that exactly prevents the data above
from being corrupted. But with autocrlf=input, it prevents _any_ CRLFs
from being converted, since checkout will not convert them back. So even
though your data is not irretrievable (the transformation _is_
reversible, you just don't have it enabled), safecrlf is still
triggering and refusing the content.
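
A rough Python model of that round-trip check, to show the difference
between the two settings. This is a simplification under my own
assumptions (the function names are invented, and the real check does
not literally run both conversions), but the outcomes match the
description above:

```python
import re


def checkin(data: bytes, autocrlf: str) -> bytes:
    """Model of checkin: 'true' and 'input' both convert CRLF to LF."""
    if autocrlf in ("true", "input"):
        return data.replace(b"\r\n", b"\n")
    return data


def checkout(data: bytes, autocrlf: str) -> bytes:
    """Model of checkout: only 'true' converts bare LF back to CRLF."""
    if autocrlf == "true":
        return re.sub(rb"(?<!\r)\n", b"\r\n", data)
    return data


def safecrlf_would_reject(data: bytes, autocrlf: str) -> bool:
    """Reject when checkin followed by checkout does not give the file back."""
    return checkout(checkin(data, autocrlf), autocrlf) != data


all_crlf = b"one\r\ntwo\r\n"
mixed = b"one\r\ntwo\n"

print(safecrlf_would_reject(all_crlf, "true"))   # False: round trip is clean
print(safecrlf_would_reject(mixed, "true"))      # True: really irreversible
print(safecrlf_would_reject(all_crlf, "input"))  # True: reversible, but not by checkout
print(safecrlf_would_reject(mixed, "input"))     # True
```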

And I was suggesting that it might be useful to distinguish between
those two situations. Because right now, with autocrlf=input you have
two choices:

  - safecrlf=false, in which case you will corrupt mixed CRLF/LF data
    without any warning

  - safecrlf=true, in which case you are not allowed to check in CRLF
    at all

But there is no choice for "protect me from actual corruption, but
convert text files (i.e., files whose line endings are all CRLF)".
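
The missing third choice might be sketched like this in Python. It is
purely hypothetical: no such setting exists, and the function name is
invented. The idea is to refuse only content where CRLF->LF conversion
cannot be undone under any setting, which in this simple model means
content that has both CRLF and bare LF:

```python
import re


def conversion_is_irreversible(data: bytes) -> bool:
    """True if CRLF->LF would lose information no setting can restore.

    Hypothetical check: the conversion is lossy only when the content
    mixes CRLF with bare LF, since afterwards the two look identical.
    """
    has_crlf = b"\r\n" in data
    has_bare_lf = re.search(rb"(?<!\r)\n", data) is not None
    return has_crlf and has_bare_lf


print(conversion_is_irreversible(b"one\r\ntwo\r\n"))  # False: all CRLF, convert it
print(conversion_is_irreversible(b"one\ntwo\n"))      # False: nothing to convert
print(conversion_is_irreversible(b"one\r\ntwo\n"))    # True: mixed, refuse
```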

I am a bit concerned about a proposal to set safecrlf=false in all
cvsimported repositories.  You are turning off the protection against
corrupting binary files.  _Even if_ the person has put safecrlf=true
into their ~/.gitconfig and thinks they are safe.

-Peff