Clemens Buchacher <drizzd@xxxxxx> writes: > Ok, then let's take a step back. I do not actually care if git diff and > friends say the worktree is clean or not. You may not, but many existing scripts people have do. > But I know that I did not make > any modifications to the worktree, because I just did git reset --hard. > And now I want to use commands like cherry-pick and checkout without > failure. But they can fail, because they essentially use git diff to > check if there are worktree changes, and if so refuse to overwrite them. Yes, exactly. > So, if the check "Am I allowed to modify the worktree file?", would go > the extra mile to also check if the worktree is clean in the sense that > convert_to_worktree(index state) matches the worktree. If this is the > case, then it is safe to modify the file because it is the committed > state, and can be recovered. So in essense, the proposed "fix" is "let's fix it in the right way"? The way we defined "would we lose some changes that are only in the working tree?", aka "is the working tree dirty wrt the index?", has been to check if "git add -u" would change the states in the index. And for scripted Porcelains and end-user scripts, "git diff-files", aka "what change would 'git add -u' make to the states in the index?", has been the command to do the same check. Your proposal is to redefine "is the working tree dirty?"; it would check if "git checkout -f" would change what is in the working tree. I agree that indeed is "would we lose some changes that are only in the working tree", and I think we can do that transparently for "internal" commands, i.e. without any end-user impact, as the new check would behave identically when they have sane contents--the difference between the current check and the new check only exists when the contents in the index contradicts what the user specifies for to-git conversion via eol or clean filter. We would need a way for our scripted Porcelains and end-user scripts to ask that new question, though, but I think that is not something insurmountable. A new option to "diff-files" or something, perhaps, would be workable, but having a new "git require-clean-work-tree" plumbing, which would replace require_clean_work_tree shell helper in git-sh-setup, may be conceptually much cleaner, because the new definition of "working tree being clean" is no longer tied to what "diff" should say. I like that as a general direction. > Regarding performance impact: We only need to do this extra check if the > usual check convert_to_git(work tree) against index state fails, and > conversion is in effect. How would you detect the failure, though? Having contents in the index that contradicts the attributes and eol settings affects the cleanliness both ways. Running the working tree contents via to-git conversion and hashing may match the blob in the index, declaring that the index entry is "clean", but passing the blob to to-worktree conversion may produce result different from what is in the worktree, which is "falsely clean". That is an equally important case that is opposite from what we have been primarily discussing, which is "falsely dirty". >> Besides, I do not think the above approach really solves the issue, >> either. After "git reset --hard" to have the contents in the index >> dumped to the working tree, if your core.autocrlf is flipped, > > Indeed, if the user configuration changes, then we cannot always detect > this (at least if the filter is an external program, and the behavior of > that changes). But the user is in control of that, and we can document > this limitation. That argument does not result in a very useful result, though. Because the user is in control of what attributes and eol settings are in effect in her repository, we can just document that the current check will give unspecified result if the indexed contents contradict with that setting, e.g. when you have CRLF encoded data in the index but the eol conversion assumes LF in the repository. But this discussion is an attempt to do better than that, no? > On the other hand, a user who simply follows an upstream repository by > doing git pull all the time, and who does not make changes to their > configuration, can still run into this issue, because upstream could > change .gitattributes. This part we could actually detect by hashing the > attributes for each index entry, and if that changes we re-evaluate the > file state. If this has to bloat each index entry, I do not think solving the problem is worth that cost of that overhead. I'd rather just say "if you have inconsistent data, here is a workaround using 'reset' and then 'reset --hard'" and be done with it. > This is also an issue only if a smudge filter is in place. The eol > conversion which only acts in the convert_to_git direction is not > affected. IIRC, autocrlf=true would strip CR at the end of line in to-git conversion, and would add CR in to-worktree conversion. So some eol conversion may only act in to-git, but some others do affect both, and without needing you to touch attributes. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html