At this point, the index records a blob with LF line ending,
while the work tree file has the same content with CRLF line
ending.
I think this needs more than just sleeping on.
There are two separate problems related to crlf treatment in git that
manifest themselves in the quirks you see in the current implementation:
(1) The fact that the index may be misaligned with the work tree. Junio's
example demonstrates this well. I have resorted to

    $ rm -rf *
    $ git reset --hard

in the past to get a work tree that passes

    $ git status

without false positives after changing the value of autocrlf.
(2) The fact that repository content may be mangled in an indeterminate way
because of the current work tree <-> repository transformation algorithm.
While criticism in the past has mainly been levelled at not knowing whether
a truly binary file will be correctly determined as such, content can be
lost in the round trip work tree -> repository -> work tree much more
simply:
    $ git init
    $ git config core.autocrlf true
    $ echo ab | tr ab \\r\\n >a.txt
    $ od -t a a.txt
    0000000 cr nl nl
    0000003
    $ git add a.txt
    $ git commit
    $ rm a.txt
    $ git reset --hard
    $ od -t a a.txt
    0000000 cr nl cr nl
    0000004
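
To make the loss concrete, here is a small Python model of that round trip.
It is only a sketch: the two functions are a simplified stand-in I am
assuming for the conversion (collapse every CRLF on the way in, expand every
LF on the way out), not git's actual code, and the function names are made
up for illustration.

```python
# Simplified model of the round trip; NOT git's actual conversion code.

def naive_to_repo(data: bytes) -> bytes:
    """Work tree -> repository: collapse every CRLF pair to LF."""
    return data.replace(b"\r\n", b"\n")


def naive_to_worktree(data: bytes) -> bytes:
    """Repository -> work tree: expand every LF to CRLF."""
    return data.replace(b"\n", b"\r\n")


original = b"\r\n\n"                  # cr nl nl, the content of a.txt above
blob = naive_to_repo(original)        # b"\n\n": the lone LF is now ambiguous
restored = naive_to_worktree(blob)    # b"\r\n\r\n": cr nl cr nl

print("original:", original)
print("blob:    ", blob)
print("restored:", restored)
print("lossless:", restored == original)
```

The blob no longer records which LF came from a CRLF pair, so the checkout
side cannot reconstruct the original three bytes.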
In summary, it irks me that autocrlf true mode is a second cousin of
autocrlf false and I think that there *should* be an acceptable
deterministic solution to this.
The solution to (2) seems easier than (1): could the transformation
algorithm be made deterministic and changed to something like "convert all
crlf pairs to lf if and only if no singleton cr or lf exists in the file
before conversion"? If a binary file were converted in error, getting the
original back would be an easy transformation with standard tools. If an
otherwise text file has mixed lf and crlf endings, or additional cr or lf
sprinkled randomly through it, the file is not transformed.
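
The rule I have in mind could be sketched like this in Python. This is an
illustration of the proposal only, under the assumption that the whole file
is available as a byte string; the names count_endings, to_repo and
to_worktree are hypothetical and do not correspond to anything in git.

```python
def count_endings(data: bytes):
    """Return (crlf pairs, lone CRs, lone LFs) found in data."""
    crlf = data.count(b"\r\n")
    lone_cr = data.count(b"\r") - crlf   # CR not followed by LF
    lone_lf = data.count(b"\n") - crlf   # LF not preceded by CR
    return crlf, lone_cr, lone_lf


def to_repo(data: bytes):
    """Work tree -> repository. Returns (blob, converted).

    Convert only when every CR and every LF is part of a CRLF pair;
    anything else is stored untouched.
    """
    crlf, lone_cr, lone_lf = count_endings(data)
    if crlf > 0 and lone_cr == 0 and lone_lf == 0:
        return data.replace(b"\r\n", b"\n"), True
    return data, False


def to_worktree(blob: bytes, converted: bool) -> bytes:
    """Repository -> work tree: undo the conversion only if it was applied."""
    return blob.replace(b"\n", b"\r\n") if converted else blob


for sample in (b"a\r\nb\r\n", b"\r\n\n", b"a\nb\n", b"a\rb\r\n"):
    blob, converted = to_repo(sample)
    assert to_worktree(blob, converted) == sample   # round trip is lossless
    print(sample, "->", blob, "converted" if converted else "untouched")
```

With this rule the cr nl nl file from the example above is stored as is,
because it contains a singleton lf. Note that the reverse direction needs to
know whether the conversion was applied, which is why to_repo returns a flag.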
Given a deterministic transformation algorithm, the solution to (1) boils
down to recording for each file in the work tree whether the transformation
algorithm was used or not in arriving at the file's current contents,
together with a way of telling git to force the use of the transformation
algorithm or not for a particular file. It seems to me the place that this
information *should* be recorded is the index, given that both .git/config
and .gitattributes can be changed independently of the work tree. Recording
the information in the index would mean that both autocrlf true and autocrlf
false clones of the same repository would produce equally valid work trees
with no loss of information. I am, however, not well versed enough in git
internals at the moment to know whether this is an acceptable solution or
not.
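
As a toy model of what I mean by recording the information in the index,
consider the following Python sketch. Everything in it is hypothetical: the
IndexEntry class, its converted field and the helper functions are invented
for illustration and say nothing about git's real index format.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical index entry: the blob plus a per-file conversion flag."""
    blob: bytes
    converted: bool = False


def expandable(blob: bytes) -> bool:
    """True if the blob has LFs and no CR, so LF -> CRLF can be undone."""
    return b"\n" in blob and b"\r" not in blob


def checkout(entry: IndexEntry, autocrlf: bool) -> bytes:
    """Write the work tree content and remember whether it was converted."""
    entry.converted = autocrlf and expandable(entry.blob)
    return entry.blob.replace(b"\n", b"\r\n") if entry.converted else entry.blob


def is_clean(entry: IndexEntry, worktree: bytes) -> bool:
    """Compare using the recorded flag, not the current config value."""
    expected = entry.blob.replace(b"\n", b"\r\n") if entry.converted else entry.blob
    return worktree == expected


entry = IndexEntry(blob=b"a\nb\n")
worktree = checkout(entry, autocrlf=True)   # work tree gets CRLF endings

# Suppose core.autocrlf is now switched to false. The comparison does not
# consult the config, so the file is still reported as unmodified.
print("clean:", is_clean(entry, worktree))
```

Because the flag travels with the entry, the status check depends only on
how the file was actually written out, not on whatever .git/config or
.gitattributes say at the moment.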