Re: [PATCH] Re: Teach mailinfo to ignore everything before -- >8 -- mark

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The 24/08/09, Junio C Hamano wrote:

>                                                                    perhaps                                                                                                
> we should tighten the rules a bit,

<...>

> I think we have bikeshedded long enough, so I won't be touching this code
> any further only to change the definition of what a scissors mark looks
> like,

I'm not sure I understand. Are you still open to a patch touching this code
/too/?

Anyway, here's what I wrote based on your last round in pu. I've change the
rules to something because I think we'd rather simple â?? and "easy" to 
explain to the end-user â?? rules over "obfuscated" ones.

-- >8 -- squashable to 8683eeb (ogirin/pu) -- >8 --

Subject: Teach mailinfo to ignore everything before a scissors line

This teaches mailinfo the scissors mark (e.g. "-- >8 --");
the command ignores everything before it in the message body.

For lefties among us, we also support -- 8< -- ;-)

We can skip this check using the "--ignore-scissors" option on both
the git-mailinfo and the git-am command line. This is necessary
because the stripped message may be either

  interesting from the eyes of the maintainer, regardless what the
  author think;

or

  the scissors line check is a false positive.


Basically, the rules are:

(1) a scissors mark:

  - must be 8 characters long;
  - must have a dash;
  - must have either ">8" or "<8";
  - may contain spaces.

(2) a scissors line:

  - must have only one scissors mark;
  or
  - must have any comment between two identical scissors marks;
  - always ignore spaces outside the scissors marks.


Signed-off-by: Nicolas Sebrecht <nicolas.s.dev@xxxxxx>
---
 Documentation/git-am.txt       |   14 +++++-
 Documentation/git-mailinfo.txt |    7 ++-
 builtin-mailinfo.c             |  103 +++++++++++++++++++++++----------------
 git-am.sh                      |   14 ++++-
 4 files changed, 90 insertions(+), 48 deletions(-)

diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
index fcacc94..2773a3e 100644
--- a/Documentation/git-am.txt
+++ b/Documentation/git-am.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 	 [--3way] [--interactive] [--committer-date-is-author-date]
 	 [--ignore-date] [--ignore-space-change | --ignore-whitespace]
 	 [--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>]
-	 [--reject] [-q | --quiet]
+	 [--reject] [-q | --quiet] [--ignore-scissors]
 	 [<mbox> | <Maildir>...]
 'git am' (--skip | --resolved | --abort)
 
@@ -118,6 +118,14 @@ default.   You can use `--no-utf8` to override this.
 --abort::
 	Restore the original branch and abort the patching operation.
 
+--ignore-scissors::
+	Do not check for scissors line in the commit message.  A scissors
+	line consists of a scissors mark which must be at least 8
+	characters long and which must contain dashes '-' and a scissors
+	(either ">8" or "<8").  Spaces are also permited inside the mark.
+	To add a comment on this line, it must be embedded between two
+	identical marks (e.g. "-- >8 -- squashme to <commit> -- >8 --").
+
 DISCUSSION
 ----------
 
@@ -131,7 +139,9 @@ commit is about in one line of text.
 "From: " and "Subject: " lines starting the body (the rest of the
 message after the blank line terminating the RFC2822 headers)
 override the respective commit author name and title values taken
-from the headers.
+from the headers. These lines immediatly following a scissors line
+override the respective fields regardless what could stand at the
+beginning of the body.
 
 The commit message is formed by the title taken from the
 "Subject: ", a blank line and the body of the message up to
diff --git a/Documentation/git-mailinfo.txt b/Documentation/git-mailinfo.txt
index 8d95aaa..e16a577 100644
--- a/Documentation/git-mailinfo.txt
+++ b/Documentation/git-mailinfo.txt
@@ -8,7 +8,8 @@ git-mailinfo - Extracts patch and authorship from a single e-mail message
 
 SYNOPSIS
 --------
-'git mailinfo' [-k] [-u | --encoding=<encoding> | -n] <msg> <patch>
+'git mailinfo' [-k] [-u | --encoding=<encoding> | -n] [--ignore-scissors]
+<msg> <patch>
 
 
 DESCRIPTION
@@ -49,6 +50,10 @@ conversion, even with this flag.
 -n::
 	Disable all charset re-coding of the metadata.
 
+--ignore-scissors::
+	Do not check for a scissors line (see linkgit:git-am[1]
+	for more information on scissors lines).
+
 <msg>::
 	The commit log message extracted from e-mail, usually
 	except the title line which comes from e-mail Subject.
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 7e09b51..92319f6 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -6,6 +6,7 @@
 #include "builtin.h"
 #include "utf8.h"
 #include "strbuf.h"
+#include "git-compat-util.h"
 
 static FILE *cmitmsg, *patchfile, *fin, *fout;
 
@@ -25,6 +26,7 @@ static enum  {
 static struct strbuf charset = STRBUF_INIT;
 static int patch_lines;
 static struct strbuf **p_hdr_data, **s_hdr_data;
+static int ignore_scissors = 0;
 
 #define MAX_HDR_PARSED 10
 #define MAX_BOUNDARIES 5
@@ -715,51 +717,63 @@ static inline int patchbreak(const struct strbuf *line)
 static int is_scissors_line(const struct strbuf *line)
 {
 	size_t i, len = line->len;
-	int scissors = 0, gap = 0;
-	int first_nonblank = -1;
-	int last_nonblank = 0, visible, perforation, in_perforation = 0;
 	const char *buf = line->buf;
+	size_t mark_start = 0, mark_end = 0, mark_len;
+	int scissors_dashes_seen = 0;
 
 	for (i = 0; i < len; i++) {
 		if (isspace(buf[i])) {
-			if (in_perforation) {
-				perforation++;
-				gap++;
-			}
+			if (scissors_dashes_seen)
+				mark_end = i;
 			continue;
 		}
-		last_nonblank = i;
-		if (first_nonblank < 0)
-			first_nonblank = i;
+		if (!scissors_dashes_seen)
+			mark_start = i;
 		if (buf[i] == '-') {
-			in_perforation = 1;
-			perforation++;
+			mark_end = i;
+			scissors_dashes_seen |= 01;
 			continue;
 		}
 		if (i + 1 < len &&
 		    (!memcmp(buf + i, ">8", 2) || !memcmp(buf + i, "8<", 2))) {
-			in_perforation = 1;
-			perforation += 2;
-			scissors += 2;
 			i++;
+			mark_end = i;
+			scissors_dashes_seen |= 02;
 			continue;
 		}
-		in_perforation = 0;
+		break;
 	}
 
-	/*
-	 * The mark must be at least 8 bytes long (e.g. "-- >8 --").
-	 * Even though there can be arbitrary cruft on the same line
-	 * (e.g. "cut here"), in order to avoid misidentification, the
-	 * perforation must occupy more than a third of the visible
-	 * width of the line, and dashes and scissors must occupy more
-	 * than half of the perforation.
-	 */
+	if (scissors_dashes_seen == 03) {
+		/* strip trailing spaces at the end of the mark */
+		for (i = mark_end; i >= mark_start && i <= mark_end; i--) {
+			if (isspace(buf[i]))
+				mark_end--;
+			else
+				break;
+		}
 
-	visible = last_nonblank - first_nonblank + 1;
-	return (scissors && 8 <= visible &&
-		visible < perforation * 3 &&
-		gap * 2 < perforation);
+		mark_len = mark_end - mark_start + 1;
+		if (mark_len >= 8) {
+			/* ignore trailing spaces at the end of the line */
+			len--;
+			for (i = len - 1; i >= 0; i--) {
+				if (isspace(buf[i]))
+					len--;
+				else
+					break;
+			}
+			/*
+			 * The mark is 8 charaters long and contains at least one dash and
+			 * either a ">8" or "<8". Check if the last mark in the line
+			 * matches the first mark found without worrying about what could
+			 * be between them. Only one mark in the whole line is permitted.
+			 */
+			return (!memcmp(buf + mark_start, buf + len - mark_len, mark_len));
+		}
+	}
+
+	return 0;
 }
 
 static int handle_commit_msg(struct strbuf *line)
@@ -782,22 +796,25 @@ static int handle_commit_msg(struct strbuf *line)
 	if (metainfo_charset)
 		convert_to_utf8(line, charset.buf);
 
-	if (is_scissors_line(line)) {
-		int i;
-		rewind(cmitmsg);
-		ftruncate(fileno(cmitmsg), 0);
-		still_looking = 1;
+	if (!ignore_scissors) {
+		if (is_scissors_line(line)) {
+			warning("scissors line found, will skip text above");
+			int i;
+			rewind(cmitmsg);
+			ftruncate(fileno(cmitmsg), 0);
+			still_looking = 1;
 
-		/*
-		 * We may have already read "secondary headers"; purge
-		 * them to give ourselves a clean restart.
-		 */
-		for (i = 0; header[i]; i++) {
-			if (s_hdr_data[i])
-				strbuf_release(s_hdr_data[i]);
-			s_hdr_data[i] = NULL;
+			/*
+			 * We may have already read "secondary headers"; purge
+			 * them to give ourselves a clean restart.
+			 */
+			for (i = 0; header[i]; i++) {
+				if (s_hdr_data[i])
+					strbuf_release(s_hdr_data[i]);
+				s_hdr_data[i] = NULL;
+			}
+			return 0;
 		}
-		return 0;
 	}
 
 	if (patchbreak(line)) {
@@ -1011,6 +1028,8 @@ int cmd_mailinfo(int argc, const char **argv, const char *prefix)
 	while (1 < argc && argv[1][0] == '-') {
 		if (!strcmp(argv[1], "-k"))
 			keep_subject = 1;
+		else if (!strcmp(argv[1], "--ignore-scissors"))
+			ignore_scissors = 1;
 		else if (!strcmp(argv[1], "-u"))
 			metainfo_charset = def_charset;
 		else if (!strcmp(argv[1], "-n"))
diff --git a/git-am.sh b/git-am.sh
index 3c03f3e..17c883f 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -15,6 +15,7 @@ q,quiet         be quiet
 s,signoff       add a Signed-off-by line to the commit message
 u,utf8          recode into utf8 (default)
 k,keep          pass -k flag to git-mailinfo
+ignore-scissors pass --ignore-scissors flag to git-mailinfo
 whitespace=     pass it through git-apply
 ignore-space-change pass it through git-apply
 ignore-whitespace pass it through git-apply
@@ -288,7 +289,7 @@ split_patches () {
 prec=4
 dotest="$GIT_DIR/rebase-apply"
 sign= utf8=t keep= skip= interactive= resolved= rebasing= abort=
-resolvemsg= resume=
+resolvemsg= resume= ignore_scissors=
 git_apply_opt=
 committer_date_is_author_date=
 ignore_date=
@@ -310,6 +311,8 @@ do
 		utf8= ;;
 	-k|--keep)
 		keep=t ;;
+	--ignore-scissors)
+		ignore_scissors=t ;;
 	-r|--resolved)
 		resolved=t ;;
 	--skip)
@@ -435,7 +438,7 @@ else
 
 	split_patches "$@"
 
-	# -s, -u, -k, --whitespace, -3, -C, -q and -p flags are kept
+	# Following flags are kept
 	# for the resuming session after a patch failure.
 	# -i can and must be given when resuming.
 	echo " $git_apply_opt" >"$dotest/apply-opt"
@@ -443,6 +446,7 @@ else
 	echo "$sign" >"$dotest/sign"
 	echo "$utf8" >"$dotest/utf8"
 	echo "$keep" >"$dotest/keep"
+	echo "$ignore_scissors" >"$dotest/ignore-scissors"
 	echo "$GIT_QUIET" >"$dotest/quiet"
 	echo 1 >"$dotest/next"
 	if test -n "$rebasing"
@@ -484,6 +488,10 @@ if test "$(cat "$dotest/keep")" = t
 then
 	keep=-k
 fi
+if test "$(cat "$dotest/ignore-scissors")" = t
+then
+	ignore_scissors='--ignore-scissors'
+fi
 if test "$(cat "$dotest/quiet")" = t
 then
 	GIT_QUIET=t
@@ -538,7 +546,7 @@ do
 	# by the user, or the user can tell us to do so by --resolved flag.
 	case "$resume" in
 	'')
-		git mailinfo $keep $utf8 "$dotest/msg" "$dotest/patch" \
+		git mailinfo $keep $ignore_scissors $utf8 "$dotest/msg" "$dotest/patch" \
 			<"$dotest/$msgnum" >"$dotest/info" ||
 			stop_here $this
 
-- 
1.6.4.1.334.gf42e22

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]