Re: [RFH] eol=lf on existing mixed line-ending files

On Fri, Apr 8, 2011 at 3:15 AM, Jeff King <peff@xxxxxxxx> wrote:
>
>  git init repo &&
>  cd repo &&
>  {
>    printf 'one\n' &&
>    printf 'two\r\n'
>  } >mixed &&
>  git add mixed &&
>  git commit -m one &&
>  echo '* eol=lf' >.gitattributes
>
> Now if we run "git status" or "git diff", it will let us know that
> "mixed" is modified, insofar as adding and committing it would perform
> the LF conversion.

Well, git _may_ report that the file is modified, but usually when you
change .gitattributes, git does not notice changes to line endings
until you touch those files. You can force git to notice changes in
all files by doing:

 $ touch -d 2000-1-1 .git/index

so it will re-read all files, but I guess it should do that
automatically; otherwise many people end up with inconsistent
line endings in their repository as a result of editing .gitattributes
(or just by pulling a new version from upstream).

>
> Now we come to the first confusing behavior. Generally one would expect
> the working directory to be clean after a "git reset --hard". But not
> here:
>
>  git reset --hard &&
>  git status
>
> will still show "mixed" as modified.

That is because you discard all changes except those to .gitattributes,
which is untracked. If .gitattributes were tracked, "reset" would discard
those changes too, and you would get the clean original state.

> So that kind of makes sense. But it isn't all that helpful, if I just
> want to reset my working tree to something sane without making a new
> commit (more on this later).

If we do not discard changes to .gitattributes, then the question is
what a sane state is. It is really difficult to define what is sane
when conversion to the work tree and back gives a different result.

> But here's an extra helping of confusion on top. Every once in a while,
> doing the reset _won't_ keep "mixed" as modified. I can trigger it
> reliably by inserting an extra sleep into git:

You can get the same effect by doing:

 $ git reset --hard HEAD && sleep 1 && touch .git/index

Ironically, the race that you observed is the result of fixing another
race in git: when files are changed too quickly, they may end up with the
same timestamp. To prevent this race, git compares the timestamp of
.git/index with that of each tracked file. If the .git/index timestamp is
older than or the same as the file's, the file is considered possibly dirty,
so it is re-read from disk to check whether there are any changes. This
works well, but only if conversion to the work tree and back produces the
same result.
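
To make that rule concrete, here is a rough sketch of the timestamp
comparison as I described it above. It is not git's actual code; the
function name needs_content_check is made up for illustration, and it
relies on the "-nt" test operator that bash, dash and ksh provide.

```shell
# Hypothetical helper mirroring the rule described above (not git's code).
# Usage: needs_content_check <index-file> <tracked-file>
# Returns 0 (true) when the tracked file must be re-read from disk,
# i.e. when the index is NOT strictly newer than the file.
needs_content_check() {
    index=$1
    file=$2
    if [ "$index" -nt "$file" ]; then
        return 1    # index written strictly after the file: trust cached data
    fi
    return 0        # same or older timestamp: compare the actual content
}
```

With that in mind, "sleep 1 && touch .git/index" simply makes the index
strictly newer than every file, so no content comparison happens.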

> So we get two different outcomes, depending on the index raciness. Which
> one is right, or is it right for it to be non-deterministic?

I like everything being deterministic, but in this case I do not see
how it is possible without making the normal case much slower.

> And one final question. Let's say I don't immediately convert this mixed
> file to the correct line-endings.

IMHO, adding a .gitattributes that specifies line endings while not
fixing the actual line endings of existing files is really a bad idea.

As with any other filter, the rule is that conversion from git to
the working tree and back should give the same result for any file
in the repository; otherwise you will have a lot of trouble later.
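
A quick way to see whether a file obeys that rule under eol=lf is sketched
below. The helpers to_lf and roundtrip_stable are hypothetical names, not
git commands, and to_lf only approximates the conversion git performs on
"git add" by dropping a CR that sits directly before the end of a line.

```shell
# Hypothetical helper: print the file with every CRLF turned into LF.
to_lf() {
    cr=$(printf '\r')
    sed "s/${cr}\$//" "$1"
}

# Hypothetical helper: returns 0 when the content already equals its
# LF-normalized form, so checking it out and adding it again under
# eol=lf would reproduce the same content.
roundtrip_stable() {
    to_lf "$1" | cmp -s - "$1"
}
```

For the "mixed" file from your example, roundtrip_stable fails, which is
exactly why git keeps reporting it as modified.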

> Hopefully my example made sense and was reproducible. The real repo
> which triggered this puzzle was jquery. You can try:
>
>  git clone git://github.com/jquery/jquery.git &&
>  cd jquery &&
>  git checkout 1.4.2 &&
>  git checkout master
>
> which will fail (but may succeed racily on a slow enough machine).
> Obviously they need to fix the mixed line-ending files in their repo.
> But that fix would be on HEAD, and "git checkout 1.4.2" will be forever
> broken. Is there a way to fix that?

You cannot change past history. You can, however, override that
setting using .git/info/attributes. It does not make sense to do
that in general, but it may be useful when you run git bisect.
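
As a sketch of what I mean: the helper below (add_local_attr is a made-up
name) appends a line to the repository's private attributes file, which
takes precedence over any .gitattributes in the tree. Using '* -text' to
switch off line-ending conversion is my assumption about what you would
want while bisecting; pick whatever attribute line fits your case.

```shell
# Hypothetical helper: append one attribute line to the untracked,
# per-repository attributes file.
# Usage: add_local_attr <repository-root> <attribute-line>
add_local_attr() {
    mkdir -p "$1/.git/info" &&
    printf '%s\n' "$2" >>"$1/.git/info/attributes"
}

# Example (assumption: disable conversion for all paths while bisecting):
#   add_local_attr . '* -text'
```

Remember to remove the line again afterwards, because it is not tracked
and silently affects every later checkout in that clone.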

BTW, nowadays we have a much better alternative to using

* crlf=input

Instead, you probably want to use:

* text=auto

which will automatically detect text files, so you won't have problems
with binary files. All text files are put into the repository with LF,
but users may have different endings in their working tree if they like.


Dmitry