tboegi@xxxxxx writes: > From: Torsten Bögershausen <tboegi@xxxxxx> > > When a file had been commited with CRLF and core.autocrlf is true, > the following does not roundtrip, `git apply` fails: > > printf "Added line\r\n" >>file && > git diff >patch && > git checkout -- . && > git apply patch Should this tweak be in effect when we are paying attention to the contents of the index? Or does it matter only when we are doing "git apply" without either "--index" or "--cache"? The fact that the new flag that is passed from load_preimage() down to read_old_data(), the latter of which is only about the "behave as a better 'patch -p1', ignoring the index" mode, hints that this should not affect anything when we are paying attention to the index, but then calling the load_patch_target() helper with a very generic CR_AT_EOL flag looks a bit misleading, by making it appear as if the caller is telling all three cases to do the funny CR business. I wonder if we instead want to pass the "patch" structure down the callchain and have _only_ read_old_data() pay attention to the has_crlf bit in it---that way, it becomes more obvious what is going on. I also notice that read_old_data() still passes &the_index to convert_to_git(). Can we fix convert_to_git() further to allow NULL as the istate parameter when we tell it not to look at the index ( I am reading your passing SAFE_CRLF_KEEP and SAFE_CRLF_FALSE as a way for the caller to tell that the caller knows what it wants already and does not want covnert_to_git() to look at the index)? With or without CR in the patch, I do not see any reason for the codepath from read_old_data() down to the convert API to look at the index, ever, as read_old_data() is called only when (!state->cached && !state->check_index), i.e. when we are working as a better 'patch -p1' without even having to know that we are in a Git repository, by load_patch_target(). > diff --git a/apply.c b/apply.c > index f2d599141d..63455cd65f 100644 > --- a/apply.c > +++ b/apply.c > @@ -220,6 +220,7 @@ struct patch { > unsigned int recount:1; > unsigned int conflicted_threeway:1; > unsigned int direct_to_threeway:1; > + unsigned int has_crlf:1; > struct fragment *fragments; > char *result; > size_t resultsize; > @@ -1662,6 +1663,17 @@ static void check_whitespace(struct apply_state *state, > record_ws_error(state, result, line + 1, len - 2, state->linenr); > } > > +/* Check if the patch has context lines with CRLF or > + the patch wants to remove lines with CRLF */ Style. > +static void check_old_for_crlf(struct patch *patch, const char *line, int len) > +{ > + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { > + patch->ws_rule |= WS_CR_AT_EOL; > + patch->has_crlf = 1; > + } > +} I am wondering where the discrepancy between the names (old crlf here, as opposed to "struct patch" that says "has_crlf" implying it does not care which side between old and new has one) comes from. Is the intention that you only care about what is expected of the preimage? > @@ -1712,11 +1724,13 @@ static int parse_fragment(struct apply_state *state, > if (!deleted && !added) > leading++; > trailing++; > + check_old_for_crlf(patch, line, len); > if (!state->apply_in_reverse && > state->ws_error_action == correct_ws_error) > check_whitespace(state, line, len, patch->ws_rule); > break; If check_old is about "we care about the lines that are supposed to match what currently exist in the target file to be patched", then the call to it also must pay attention to apply_in_reverse. What appears on '+' lines in a patch will become the "expected old contents the patched target is supposed to have" when we are running "apply -R". > case '-': > + check_old_for_crlf(patch, line, len); > if (state->apply_in_reverse && > state->ws_error_action != nowarn_ws_error) > check_whitespace(state, line, len, patch->ws_rule); Likewise. Thanks for working on this; unlike the previous one I commented, I think this one I can sort of see how this is an approach to fix it.