On 05.09.22 at 10:01, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Sep 04 2022, Matheus Tavares wrote:
>
>> When applying a patch, `git am` looks for special delimiter strings
>> (such as "---") to know where the message ends and the actual diff
>> starts. If one of these strings appears in the commit message itself,
>> `am` might get confused and fail to apply the patch properly. This has
>> already caused inconveniences in the past [1][2]. To help avoid such
>> problems, let's make `git format-patch` warn on commit messages
>> containing one of the said strings.
>>
>> [1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@xxxxxxxxxx/
>> [2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/
>
> I followed this topic with one eye, and have run into this myself in the
> past. I'm not against this warning, but I wonder if we can't fix
> "am/apply" to just be smarter. The cases I've seen are all ones where:
>
>  * We have a copy/pasted git diff, but we could disambiguate based on
>    (at least) the "---" line being a telltale for the "real" patch, and
>    the "X file changed..." diffstat.
>  * We have a not-quite-git-looking patch diff in the commit message
>    (which we'd normally detect and apply), as in your [2].
>
> Couldn't we just be a bit smarter about applying these, and do a
> look-ahead and find what the user meant?

Whatever we use to separate message from diff can be included in that
message by an unsuspecting user, and "---" can be part of a diff.  An
earlier discussion yielded an idea, but no implementation:

   https://lore.kernel.org/git/20200204010524-mutt-send-email-mst@xxxxxxxxxx/

> In any case, having such a warning won't "settle" this issue, as we're
> able to deal with this non-ambiguity in commit objects/the push/fetch
> protocol. It's just "format-patch/am" as a "wire protocol" that has this
> issue.
>
> But anyway, that's the state of the world now, so warning() about it is
> fair, even if we had a fix for the "apply" part we might want to warn
> for a while to note that it's an issue on older gits.
>
>> +		if (pp->check_in_body_patch_breaks) {
>> +			strbuf_reset(&linebuf);
>> +			strbuf_add(&linebuf, line, linelen);
>> +			if (patchbreak(&linebuf) || is_scissors_line(linebuf.buf)) {
>> +				strbuf_strip_suffix(&linebuf, "\n");
>
> Hrm, it's a (small) shame that the patchbreak() function takes a "struct
> strbuf" rather than a char */size_t in this case (seemingly for no good
> reason, as it's "const"?).

A strbuf is NUL-terminated, a length-limited string (char */size_t)
doesn't have to be.  That means the current implementation can use
functions like starts_with(), but a faithful version that promises to
stay within a given length cannot.  So the reason is probably
convenience.

With skip_prefix_mem() it wouldn't be that bad, though:

---
 mailinfo.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 9621ba62a3..ae2e70e363 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -646,32 +646,30 @@ static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 	free(ret);
 }
 
-static inline int patchbreak(const struct strbuf *line)
+static int patchbreak(const char *buf, size_t len)
 {
-	size_t i;
-
 	/* Beginning of a "diff -" header? */
-	if (starts_with(line->buf, "diff -"))
+	if (skip_prefix_mem(buf, len, "diff -", &buf, &len))
 		return 1;
 
 	/* CVS "Index: " line? */
-	if (starts_with(line->buf, "Index: "))
+	if (skip_prefix_mem(buf, len, "Index: ", &buf, &len))
 		return 1;
 
 	/*
 	 * "--- <filename>" starts patches without headers
 	 * "---<sp>*" is a manual separator
 	 */
-	if (line->len < 4)
+	if (len < 4)
 		return 0;
 
-	if (starts_with(line->buf, "---")) {
+	if (skip_prefix_mem(buf, len, "---", &buf, &len)) {
 		/* space followed by a filename? */
-		if (line->buf[3] == ' ' && !isspace(line->buf[4]))
+		if (len > 1 && buf[0] == ' ' && !isspace(buf[1]))
 			return 1;
 		/* Just whitespace? */
-		for (i = 3; i < line->len; i++) {
-			unsigned char c = line->buf[i];
+		for (; len; buf++, len--) {
+			unsigned char c = buf[0];
 			if (c == '\n')
 				return 1;
 			if (!isspace(c))
@@ -682,14 +680,14 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }
 
-static int is_scissors_line(const char *line)
+static int is_scissors_line(const char *line, size_t len)
 {
 	const char *c;
 	int scissors = 0, gap = 0;
 	const char *first_nonblank = NULL, *last_nonblank = NULL;
 	int visible, perforation = 0, in_perforation = 0;
 
-	for (c = line; *c; c++) {
+	for (c = line; len; c++, len--) {
 		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
@@ -705,12 +703,14 @@ static int is_scissors_line(const char *line)
 			perforation++;
 			continue;
 		}
-		if (starts_with(c, ">8") || starts_with(c, "8<") ||
-		    starts_with(c, ">%") || starts_with(c, "%<")) {
+		if (skip_prefix_mem(c, len, ">8", &c, &len) ||
+		    skip_prefix_mem(c, len, "8<", &c, &len) ||
+		    skip_prefix_mem(c, len, ">%", &c, &len) ||
+		    skip_prefix_mem(c, len, "%<", &c, &len)) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
-			c++;
+			c--, len++;
 			continue;
 		}
 		in_perforation = 0;
@@ -747,7 +747,8 @@ static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
 	if (mi->inbody_header_accum.len &&
 	    (line->buf[0] == ' ' || line->buf[0] == '\t')) {
-		if (mi->use_scissors && is_scissors_line(line->buf)) {
+		if (mi->use_scissors &&
+		    is_scissors_line(line->buf, line->len)) {
 			/*
 			 * This is a scissors line; do not consider this line
 			 * as a header continuation line.
@@ -808,7 +809,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	if (convert_to_utf8(mi, line, mi->charset.buf))
 		return 0; /* mi->input_error already set */
 
-	if (mi->use_scissors && is_scissors_line(line->buf)) {
+	if (mi->use_scissors && is_scissors_line(line->buf, line->len)) {
 		int i;
 
 		strbuf_setlen(&mi->log_message, 0);
@@ -826,7 +827,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 		return 0;
 	}
 
-	if (patchbreak(line)) {
+	if (patchbreak(line->buf, line->len)) {
 		if (mi->message_id)
 			strbuf_addf(&mi->log_message,
 			"Message-Id: %s\n", mi->message_id);
-- 
2.37.2
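
For readers without the git tree at hand, here is a standalone sketch of
the point made above about NUL-terminated versus length-limited strings.
The helper skip_prefix_bounded() and the demo() function are hypothetical
names made up for this illustration; the helper is only assumed to behave
like skip_prefix_mem() as it is used in the patch (on a match it reports
the remainder, on a mismatch it leaves the output arguments alone).  It is
not the code from git itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for skip_prefix_mem(), written out only for
 * illustration: succeed if the first len bytes of buf start with
 * prefix, and report the remainder through out and outlen.  On
 * failure, out and outlen are left untouched.  It never reads past
 * buf[len - 1], so buf does not need a NUL terminator.
 */
static int skip_prefix_bounded(const char *buf, size_t len,
			       const char *prefix,
			       const char **out, size_t *outlen)
{
	size_t prefix_len = strlen(prefix);

	if (prefix_len > len || memcmp(buf, prefix, prefix_len))
		return 0;
	*out = buf + prefix_len;
	*outlen = len - prefix_len;
	return 1;
}

/* Exercise the helper on a buffer that has no NUL terminator. */
static void demo(void)
{
	/* Five bytes, deliberately not NUL-terminated. */
	const char raw[5] = { '-', '-', '-', ' ', 'a' };
	const char *rest = raw;
	size_t restlen = sizeof(raw);

	/* Only two bytes are visible, so "---" must not match. */
	assert(!skip_prefix_bounded(raw, 2, "---", &rest, &restlen));
	assert(rest == raw && restlen == sizeof(raw));

	/* With the full length the prefix matches; " a" remains. */
	assert(skip_prefix_bounded(raw, sizeof(raw), "---", &rest, &restlen));
	assert(rest == raw + 3 && restlen == 2);
}
```

A starts_with()-style check on the same five bytes would have to rely on
a terminator that is not there, which is why the char */size_t variant
needs a bounded helper of this kind.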