On Sat, Apr 09, 2011 at 10:58:59PM +0400, Dmitry Potapov wrote:

> > Now we come to the first confusing behavior. Generally one would expect
> > the working directory to be clean after a "git reset --hard". But not
> > here:
> >
> >   git reset --hard &&
> >   git status
> >
> > will still show "mixed" as modified.
>
> It is because you discard all changes except to .gitattributes. If
> .gitattributes were tracked, "reset" would discard them too, and you
> would get clean original state.

Yeah, in this case. But gitattributes could easily be in the repository
already, and reset still wouldn't change it (as it is in the jquery
example).

> > So that kind of makes sense. But it isn't all that helpful, if I just
> > want to reset my working tree to something sane without making a new
> > commit (more on this later).
>
> If we do not discard changes to .gitattributes then the question is
> what a sane state is? It is really difficult to define what is sane
> when conversion to the work tree and back gives a different result.

Agreed. The problem is the disconnect between what is in the repository,
and what _would_ be in the repository if we committed the file. So
obviously what the user is giving to git in this case is slightly
insane. I just wonder if git can do better. But the only options I could
think of are:

  1. Set the working tree file to have just LF's. But that doesn't help,
     since it is the conversion _to_ linefeeds that makes it look like
     the file is changed. So we'd still see unstaged changes.

  2. Set the index file to have just LF's. That would make the working
     tree look clean, but it would look like changes are staged, which
     is even worse.

> > But here's an extra helping of confusion on top. Every once in a while,
> > doing the reset _won't_ keep "mixed" as modified.
> > I can trigger it reliably by inserting an extra sleep into git:
>
> you can have the same effect by doing:
>
> git reset --hard HEAD && sleep 1 && touch .git/index

Yeah, that has the same effect. I wanted to show the sleep inside git to
demonstrate that it really is an inside-git race condition.

> Ironically, the race that you observed is a result of fixing another
> race in git when files are changed too fast, so they may have the same
> timestamp. To prevent this race, git checks the timestamp of .git/index
> and a tracked file. If .git/index timestamp is older or same as that file,
> this file is considered dirty. So, it is re-read from the disk to check
> if there are any changes. This works well but only if conversion to the
> work tree and back produces the same result.

Yeah, that's my analysis, too.

> > So we get two different outcomes, depending on the index raciness. Which
> > one is right, or is it right for it to be non-deterministic?
>
> I like everything being deterministic, but in this case I do not see
> how it is possible without making the normal case much slower.

I think if you took my (1) suggestion above, it would be deterministic.
I don't know how much that would help. It would at least force people to
always see the change and hopefully spur them to commit the fixed
line-endings.

> > And one final question. Let's say I don't immediately convert this mixed
> > file to the correct line-endings.
>
> IMHO, adding .gitattributes that specifies line endings while not
> fixing actual line endings of existing files is really a bad idea.

I absolutely agree, and my first advice upon seeing this jquery repo was
to fix those line endings. But they went for over a year with the broken
setup, so clearly it wasn't bothering them. I wonder what git could do
better to provoke them to fix it sooner.
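
For anyone following along, the racy-index check discussed above can be
modeled roughly as below. This is only a sketch, not git's actual code:
to_git() and looks_modified() are hypothetical names, to_git() merely
approximates the crlf=input conversion (CRLF to LF on the way in,
nothing on the way out), and the timestamps are plain integers.

```python
def to_git(worktree_bytes: bytes) -> bytes:
    # Hypothetical stand-in for the crlf=input conversion: CRLF -> LF.
    return worktree_bytes.replace(b"\r\n", b"\n")


def looks_modified(blob: bytes, worktree_bytes: bytes,
                   index_mtime: int, file_mtime: int) -> bool:
    # Assumes the cached stat data matches the file (as right after a reset).
    if index_mtime > file_mtime:
        # Index is strictly newer: trust the stat data, skip the content check.
        return False
    # Index is older than or as old as the file: the entry is racy, so
    # re-read the file and compare its converted content with the blob.
    return to_git(worktree_bytes) != blob


mixed = b"one\r\ntwo\n"   # mixed line endings, stored as-is in the repository
clean = b"one\ntwo\n"     # LF-only file, for comparison

# Same timestamp (racy): content is compared, and the mixed file looks dirty.
racy_result = looks_modified(mixed, mixed, index_mtime=100, file_mtime=100)
# Index written a second later: stat data is trusted, the file looks clean.
settled_result = looks_modified(mixed, mixed, index_mtime=101, file_mtime=100)
# An LF-only file gets the same answer on both paths.
clean_results = (looks_modified(clean, clean, 100, 100),
                 looks_modified(clean, clean, 101, 100))
```

In this model the two results for the mixed file are the two outcomes
described above, while a file whose conversion round-trips gets the same
answer no matter which path is taken.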
> As with any other filter, the rule is that conversion from git to
> the working tree and back should give the same result for any file
> in the repository, otherwise you will have a lot of trouble later.

I think that's a good rule in general, but doesn't crlf=input (and now
eol=lf, and by extension, the text attribute) encourage exactly that if
you have mixed line-ending files?

I think the moral of the story may simply be that mixed line-ending text
files are an abomination which should be rooted out and destroyed.

> >   git clone git://github.com/jquery/jquery.git &&
> >   cd jquery &&
> >   git checkout 1.4.2 &&
> >   git checkout master
> >
> > which will fail (but may succeed racily on a slow enough machine).
> > Obviously they need to fix the mixed line-ending files in their repo.
> > But that fix would be on HEAD, and "git checkout 1.4.2" will be forever
> > broken. Is there a way to fix that?
>
> You cannot change the past history. Well, you can override that
> setting using .git/info/attributes. It does not make sense to do
> that in general, but it may be useful if you do git bisect.

The problem with that is that for recent commits you want one set of
attributes (where the files have been fixed), and for going back to
older commits, you want a different set of attributes (where you say
"don't care about line endings in these files").

One solution would be to have a git-notes ref with per-commit
attributes, so you could selectively override attributes as you explore
history.

> BTW, nowadays, we have a much better alternative than using
>
> * crlf=input
>
> Instead of it, you probably want to use:
>
> * text=auto

Agreed, and I already recommended that to jquery people (actually, one
of the problem files you will see in the example above is a binary file,
though later on they ended up fixing its attributes by specifically
marking its extension as binary).
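
To make the round-trip rule quoted above concrete, here is a small
sketch that flags blobs which would not survive a checkout and re-add
unchanged. It assumes an eol=lf style conversion only; to_worktree(),
to_git(), round_trips() and unstable_paths() are hypothetical helpers
for illustration, not anything git provides.

```python
from typing import Dict, List


def to_worktree(blob: bytes) -> bytes:
    # Assumption: with eol=lf (or crlf=input) nothing changes on checkout.
    return blob


def to_git(worktree_bytes: bytes) -> bytes:
    # Assumption: on the way back in, CRLF is normalized to LF.
    return worktree_bytes.replace(b"\r\n", b"\n")


def round_trips(blob: bytes) -> bool:
    # The rule: repository -> working tree -> repository must be a no-op.
    return to_git(to_worktree(blob)) == blob


def unstable_paths(blobs: Dict[str, bytes]) -> List[str]:
    # Paths whose stored content would show up as modified after checkout.
    return sorted(path for path, blob in blobs.items()
                  if not round_trips(blob))


example = {
    "lf.txt": b"a\nb\n",
    "mixed.txt": b"a\r\nb\n",
    "crlf.txt": b"a\r\nb\r\n",
}
flagged = unstable_paths(example)
```

Under these assumptions any blob that contains a CRLF is flagged, not
just the mixed one, since the stored bytes differ from what a re-add
would produce.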

-Peff