Junio C Hamano <gitster@xxxxxxxxx> writes: > Jeff King <peff@xxxxxxxx> writes: > >> and so on. I haven't quite figured out what is going on. It looks like >> we call update_pre_post_images with postlen==0, which causes it to just >> write the fixed postimage into the existing buffer. But of course the >> fixed version is bigger, because we are expanding the tabs into 8 >> spaces (but it _doesn't_ break if each line starts with only one tab, >> which confuses me). > > I used to be intimately familiar with the update_pre_post_images() > function, but the version after 86c91f91794c (git apply: option to > ignore whitespace differences, 2009-08-04), I won't vouch for it > doing anything sensible. We recently had to do 5de7166d46d2 > (apply.c:update_pre_post_images(): the preimage can be truncated, > 2012-10-12) to fix one of its corner cases but I would not be > surprised if there are other cases the function gets it all wrong. > > I'd come back to the topic after I finish other tasks on my plate, > so if somebody is inclined please go ahead digging this a bit > further; I won't have much head start to begin with in this code > X-<. This may be sufficient. In the olden days, we relied on that all whitespace fixing rules made the result shorter and took advantage of it in update-pre-post-images to rewrite the images in place. The oddball tab-in-indent (aka Python), however, can grow the result by expanding tabs (deemed "incorrect") in the input into runs of spaces (deemed "kosher"). Fortunately, we already support its more generalized form "match while ignoring whitespace differences" that can lengthen the result; as long as we correctly count the number of bytes needed to hold rewritten postimage, the existing logic in update_pre_post_images should be able to do the rest for us. builtin/apply.c | 16 ++++++++++------ t/t4124-apply-ws-rule.sh | 26 ++++++++++++++++++++++++++ t/t4150-am.sh | 2 +- t/test-lib-functions.sh | 4 ++-- 4 files changed, 39 insertions(+), 9 deletions(-) diff --git a/builtin/apply.c b/builtin/apply.c index 080ce2e..cad4e4f 100644 --- a/builtin/apply.c +++ b/builtin/apply.c @@ -2117,10 +2117,10 @@ static void update_pre_post_images(struct image *preimage, /* * Adjust the common context lines in postimage. This can be - * done in-place when we are just doing whitespace fixing, - * which does not make the string grow, but needs a new buffer - * when ignoring whitespace causes the update, since in this case - * we could have e.g. tabs converted to multiple spaces. + * done in-place when we are shrinking it with whitespace + * fixing, but needs a new buffer when ignoring whitespace or + * expanding leading tabs to spaces. + * * We trust the caller to tell us if the update can be done * in place (postlen==0) or not. */ @@ -2185,7 +2185,7 @@ static int match_fragment(struct image *img, int i; char *fixed_buf, *buf, *orig, *target; struct strbuf fixed; - size_t fixed_len; + size_t fixed_len, postlen; int preimage_limit; if (preimage->nr + try_lno <= img->nr) { @@ -2335,6 +2335,7 @@ static int match_fragment(struct image *img, strbuf_init(&fixed, preimage->len + 1); orig = preimage->buf; target = img->buf + try; + postlen = 0; for (i = 0; i < preimage_limit; i++) { size_t oldlen = preimage->line[i].len; size_t tgtlen = img->line[try_lno + i].len; @@ -2362,6 +2363,7 @@ static int match_fragment(struct image *img, match = (tgtfix.len == fixed.len - fixstart && !memcmp(tgtfix.buf, fixed.buf + fixstart, fixed.len - fixstart)); + postlen += tgtfix.len; strbuf_release(&tgtfix); if (!match) @@ -2399,8 +2401,10 @@ static int match_fragment(struct image *img, * hunk match. Update the context lines in the postimage. */ fixed_buf = strbuf_detach(&fixed, &fixed_len); + if (postlen < postimage->len) + postlen = 0; update_pre_post_images(preimage, postimage, - fixed_buf, fixed_len, 0); + fixed_buf, fixed_len, postlen); return 1; unmatch_exit: diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index 6f6ee88..0bbcf06 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -486,4 +486,30 @@ test_expect_success 'same, but with CR-LF line endings && cr-at-eol unset' ' test_cmp one expect ' +test_expect_success 'whitespace=fix to expand' ' + qz_to_tab_space >preimage <<-\EOF && + QQa + QQb + QQc + ZZZZZZZZZZZZZZZZd + QQe + QQf + QQg + EOF + qz_to_tab_space >patch <<-\EOF && + diff --git a/preimage b/preimage + --- a/preimage + +++ b/preimage + @@ -1,7 +1,6 @@ + QQa + QQb + QQc + -QQd + QQe + QQf + QQg + EOF + git -c core.whitespace=tab-in-indent apply --whitespace=fix patch +' + test_done diff --git a/t/t4150-am.sh b/t/t4150-am.sh index cdafd7e..12f6b02 100755 --- a/t/t4150-am.sh +++ b/t/t4150-am.sh @@ -17,7 +17,7 @@ test_expect_success 'setup: messages' ' vero eos et accusam et justo duo dolores et ea rebum. EOF - q_to_tab <<-\EOF >>msg && + qz_to_tab_space <<-\EOF >>msg && QDuis autem vel eum iriure dolor in hendrerit in vulputate velit Qesse molestie consequat, vel illum dolore eu feugiat nulla facilisis Qat vero eros et accumsan et iusto odio dignissim qui blandit diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index fa62d01..349f3b8 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -87,8 +87,8 @@ q_to_cr () { tr Q '\015' } -q_to_tab () { - tr Q '\011' +qz_to_tab_space () { + tr QZ '\011\040' } append_cr () { -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html