Clemens Buchacher <drizzd@xxxxxx> writes:

> Coming back to "[PATCH] optionally disable gitattributes": The topics
> are related, because they both deal with the situation where the work
> tree has files which are not normalized according to gitattributes. But
> my patch is more about saying: ok, I know I may have files which need to
> be normalized, but I want to ignore this issue for now. Please disable
> gitattributes for now, because I want to work with the files as they are
> committed. Conversely, the discussion here is about how to reliably
> detect and fix files which are not normalized.

I primarily wanted to make sure that you understood the underlying
issue, so that I do not have to go back to the basics in the other
thread.  And it is clear that you do, which is good.

Here, you seem to think that what t0025 wants to see happen is
sensible, judging by the fact that you call "rm .git/index && git
reset" a "fix".

My take on this is quite different.

After a "reset --hard HEAD", we should be able to trust the cached
stat information and have "diff HEAD" say "no changes".  That is what
you essentially want in the other thread, if I understand you
correctly, and in an ideal world where the filesystem timestamp has
infinite precision, that is what would happen in t0025, always
"breaking" its expectation.  The real world has much coarser
timestamp granularity than ideal, and that is why the test appears to
be "flaky", failing to give the "correct" outcome some of the
time--but I'd say that it is expecting a wrong thing.

An index entry that has data that does not round-trip when it goes
through convert_to_working_tree() and then convert_to_git() "breaks"
this arrangement, and I'd view it as the user having inconsistent
data.  It is like being in a repository that still has unmerged
paths--you cannot proceed before you resolve them.

Anyway.
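
The round-trip condition described above can be made concrete with a
toy model.  This is only a sketch in Python, not Git's code: the two
functions borrow the names convert_to_git() and
convert_to_working_tree(), but their bodies are simplified assumptions
(an LF-normalizing "clean" direction and a pass-through "smudge"
direction), and round_trips() is a hypothetical helper.

```python
def convert_to_git(worktree: bytes) -> bytes:
    """Toy 'clean' direction (assumption): normalize CRLF to LF."""
    return worktree.replace(b"\r\n", b"\n")


def convert_to_working_tree(indexed: bytes) -> bytes:
    """Toy 'smudge' direction (assumption): write the data unchanged."""
    return indexed


def round_trips(indexed: bytes) -> bool:
    """Hypothetical helper: does the indexed data survive a checkout
    followed by a re-add without changing?"""
    return convert_to_git(convert_to_working_tree(indexed)) == indexed


# Data already normalized in the index is consistent.
consistent = round_trips(b"line one\nline two\n")

# CRLF data committed before normalization was enabled is not: a
# re-add would record LF, which differs from what the index holds.
inconsistent = round_trips(b"line one\r\nline two\r\n")
```

In this model an index entry holding CRLF data is exactly the kind of
entry that does not round-trip, even though nobody has touched the
file.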
As to your patch in the other thread, here is what I think:

 (1) When you know (or perhaps your CI knows) that the working tree
     has never been modified since you did "reset --hard HEAD" (or
     its equivalent, like "git checkout $branch" from a clean state),
     these paths with inconsistent data would break the usual check
     to ask "is the working tree clean?"  That is a problem and we
     need a way to ensure that the working tree is always judged to
     be clean immediately after "reset --hard HEAD".

     IOW, I agree with you that the issue you are trying to solve is
     worth solving.

 (2) Regardless of the "inconsistent data breaking the cleanliness
     check" issue, it may be handy to have a way to temporarily
     disable the attributes, i.e. allow us to ask "what happens if
     there are no attributes defined?"

     IOW, I am saying that the change in the patch is not without
     merit.

In addition to (1), I further think that this sequence should not
report that the path F is modified:

    # Write F from HEAD to the working tree, after passing it
    # through convert_to_working_tree()
    $ git reset --hard HEAD

    # Force the re-reading, without changing the contents at all
    $ cp F F.new
    $ mv F.new F
    $ git diff HEAD

which is broken by paths with inconsistent data.  Your CI would want
a way to make that happen.

However, I do not think disabling attributes (i.e. (2)) is a solution
to the issue (i.e. (1)), which we just agreed to be an issue that is
worth solving, for at least two reasons.

 * Even without any attributes, the core.autocrlf setting can get the
   data in your index (whose lines can be terminated with CRLF) into
   the same "inconsistent data" situation.  Disabling attribute
   handling would not have any effect on that codepath, I think.

 * The indexed data and the contents in the working tree file may
   match only because the clean/smudge transformation is done.
   If you disable attributes, re-checking by passing the working tree
   contents through convert_to_git() and comparing the result with
   what is in the index would tell you that they are different, even
   if the clean/smudge filter pair implements round-trip operations
   correctly.

One way to solve (1) I can think of is to change the definition of
ce_compare_data(), which is called by the code that does not trust
the cached stat data (including but not limited to the Racy Git
codepath).  The current semantics of that function asks this
question:

    We do not know if the working tree file and the indexed data
    match.  Let's see if "git add" of that path would record the
    data that is identical to what is in the index.

This definition was cast in stone by 29e4d363 (Racy GIT, 2005-12-20)
and has been with us since Git v1.0.0.  But that does not have to be
the only sensible definition of this check.  I wonder what would
break if we ask this question instead:

    We do not know if the working tree file and the indexed data
    match.  Let's see if "git checkout" of that path would leave the
    same data as what currently is in the working tree file.

If we did this, "reset --hard HEAD" followed by "diff HEAD" will by
definition always report "is clean" as long as nobody changes files
in the working tree, even with the inconsistent data in the index.

This still requires that convert_to_working_tree(), i.e. your smudge
filter, is deterministic.  I think that is a sensible assumption for
sane people, even for those with inconsistent data in the index.
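
The two candidate questions for ce_compare_data() can be contrasted in
a small toy model.  Again this is a Python sketch under stated
assumptions, not Git's implementation: the conversion functions are
simplified stand-ins (LF-normalizing "clean", pass-through "smudge"),
and matches_by_add() and matches_by_checkout() are hypothetical names
for the current and the proposed question respectively.

```python
def convert_to_git(worktree: bytes) -> bytes:
    """Toy 'clean' direction (assumption): normalize CRLF to LF."""
    return worktree.replace(b"\r\n", b"\n")


def convert_to_working_tree(indexed: bytes) -> bytes:
    """Toy 'smudge' direction (assumption): write the data unchanged.
    It is deterministic, which the proposed check relies on."""
    return indexed


def matches_by_add(indexed: bytes, worktree: bytes) -> bool:
    """Current question: would "git add" of the working tree file
    record data identical to what is in the index?"""
    return convert_to_git(worktree) == indexed


def matches_by_checkout(indexed: bytes, worktree: bytes) -> bool:
    """Proposed question: would "git checkout" of the indexed data
    leave the same bytes as the current working tree file?"""
    return convert_to_working_tree(indexed) == worktree


# Inconsistent data: CRLF stored in the index.
indexed = b"hello\r\n"

# Model "reset --hard HEAD": the file is written from the index.
worktree = convert_to_working_tree(indexed)

# The untouched file looks modified under the current question,
# and clean under the proposed one.
dirty_under_add = not matches_by_add(indexed, worktree)
clean_under_checkout = matches_by_checkout(indexed, worktree)

# A genuine edit is still detected by both questions.
edited = b"goodbye\r\n"
edit_seen_by_add = not matches_by_add(indexed, edited)
edit_seen_by_checkout = not matches_by_checkout(indexed, edited)
```

With consistent data (for example b"hello\n" in the index) both
functions agree, so in this model the two definitions differ only for
entries that do not round-trip.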