Thomas Gummerer <t.gummerer@xxxxxxxxx> writes: > On 08/08, Junio C Hamano wrote: >> ... >> Let me ask the same question in a more direct way. Which part of >> this test break with your series? >> >> test_expect_success 'git add --refresh with pathspec' ' >> git reset --hard && >> echo >foo && echo >bar && echo >baz && >> git add foo bar baz && H=$(git rev-parse :foo) && git rm -f foo && >> echo "100644 $H 3 foo" | git update-index --index-info && >> # "sleep 1 &&" in the update here ... >> test-chmtime -60 bar baz && >> >expect && >> git add --refresh bar >actual && >> test_cmp expect actual && >> >> git diff-files --name-only >actual && >> ! grep bar actual&& >> grep baz actual >> ' >> >> We prepare an index with bunch of paths, we make "foo" unmerged, we >> smudge bar and baz stat-dirty, so that "diff-files" would report >> them, even though their contents match what is recorded in the >> index. > > After getting confused a bit myself, I now think here is the problem. > The v5 code smudges baz when doing git add --refresh bar. Therefore > baz isn't considered stat-dirty by the code, but a racily smudged entry > and therefore its content gets checked, thus not showing up in > git diff-files. So in short, the breakage is the last one among the three choices I gave you in my message you are responding to. The user asked to refresh "bar" so that later diff-files won't report a false change on it, but "baz" effectively ends up getting refreshed at the same time and a false change is not reported. That "breakage" is, from the correctness point of view, not a breakage. As the primary purpose of "refreshing" is to support commands that want to rely on a quick ce_modified() call to tell files that are modified in the working tree since it was last added to the index---you refresh once, and then you call such commands many times without having to worry about having to compare the contents between the indexed objects and the working tree files. But from the performance point of view, which is the whole point of "refresh", the behaviour of the new code is dubious. If the user is working in a large working tree (which automatically means large index, the primary reason we are doing this v5 experiment), the user often is working in a deep and narrow subdirectory of it, and a path limited refresh (the test names a specific file "bar", but imagine it were "." to limit it to the directory the user is working in) may be a cheap way not to bother even checking outside the area the user currently is working in. Also, smudging more entries than necessary to be checked by ce_modified_check_fs() later at runtime may mean that it defeats the "refresh once and then compare cheaply many times" pattern that is employed by existing scripts. Is the root cause really where the "racily-clean so smudge to tell later runtime to check contents" bit goes? I am hoping that the issue is not coming from the difference between the current code and your code when they decide to "smudge", what entries they decide to "smudge" and based on what condition. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html