On 2016-05-02 21.33, Junio C Hamano wrote: > Junio C Hamano <gitster@xxxxxxxxx> writes: > Let's step back a bit and make sure we are on the same page. I > think this "series" conflates a bit too many things into a single > topic. > > * The comparison between the index and the working tree, i.e. "git > diff", should compare result of convert_to_git() with what is in > the index, and the world around it should be made consistent with > that. Your separate "git blame" fix to add missing knowledge > that convert_to_git() would not do s/CRLF/LF/ for a path whose > index entry already is contaminated with CR falls into this > category and was a very good thing to do. > > * A convert_to_git() and convert_to_working_tree() pair that do not > roundtrip would (by definition) leave contents in the working > tree, that, when passed through convert_to_git(), will be > different from the index, upon completion of "reset --hard". We > _should_ fix it so that "git diff" _reports_ differences. > Currently, lstat(2) based optimization hides this in a racy way > (when racy Git kicks in to reinspect the index and the working > tree file actually matches, it finds out that they do not match), > it is a bug that needs to be fixed, not 10/10 where it tries to > hide the differences consistently and spreads the bug. I haven't > studied 8/10 carefully yet, but it seems to attempt the same. > > * I think the "text=auto eol=THIS" that did not mean "I do not care > to specify which ones are text files. Please detect the file > type, and for those automatically detected, please make sure that > the contents follwo THIS eol convention." was a bug, and what > 07/10 tries to do is a good thing. > > By the way, lack of the cover letter of this series made it more > painful to write a reply like this than necessary. A cover letter > for a trivial 3-patch series might be overkill, but for anything > with substance that spans more than 4-5 patches, a cover letter to > describe the overall direction would really help. The 10/10 needs to be replaced with something different, and I start to get a better picture, why. read_cache.c/ce_compare_data() checks what "git add" will change in the index. sha1_file.c/index_fd() will read the file content from the working tree, run convert_to_git() and calculate the sha1, which read_cache.c feeds into hashcmp(). When convert_to_git() is run, 3 steps are taken: - apply the filter, if any - run crlf_to_git(), if attributes and core.autocrlf say so - run ident, if specified. Now the crlf_to_git() uses has_cr_in_index(const char *path) to find out if we should keep the CRLF or not. Side note: the suggested patches will use get_convert_stats_sha1(get_sha1_from_cache(path),) This works pretty well under normal "git add", but fails in t6038, when we do a merge. The new function get_sha1_from_index() can not find the sha1, "read-cache.c/get_sha1_from_index:2392 path=file pos=-3" and falls into the code path for /* * We might be in the middle of a merge, in which * case we would read stage #2 (ours). */ read-cache.c/get_sha1_from_index:2408 path=file pos=3 read-cache.c/get_sha1_from_index:2416 path=file sha1=ad55e2 (The line number are with debug code) The problem is, that ad55e2 is _not_ the sha1, which read_cache/ce_compare_data() had been asked to look at. Instead we should check the blob with 99b633. The result is that convert/get_convert_stats_sha1() is called on the wrong blob (the one with CRLF instead the one without CRLF), and this makes t6038 from 9/10 fail. 10/10 rescues the situation, by using the correct blob :-) In short, ce_compare_data() needs to forward ce->sha1 the whole way into convert.c/crlf_to_git() and get_convert_stats_sha1(). While at it, I realized that we call a convert_attrs(&ca, path) a couple of times (e.g. in would_convert_to_git(), to find out that we really don't have any attributes set. It could be nice to do that only once. The next step will be to add the improvements/fixes for the ce_compare_data() chain as described above, and then put 7/10..9/10 on top of that. This will probably take some time, so that's why I asked if 1/10..4/10 could proceed as is ? (and the next version with cover letter, sorry for that) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html