Re: [PATCH] fast-import: catch garbage after marks in from/merge

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 02 Apr 2012 09:15:32 -0700

Pete Wyckoff <pw@xxxxxxxx> writes:

> A forgotten LF can lead to a confusing bug.  The last
> line in this commit command is wrong:
> ...
> It is missing a newline and should be:
>
>     from :1
>     M 100644 :103 hello.c

During my first reading of this, this introductory part made me think "oh,
so this is a patch to fix somebody who produces a wrong data that is fed
to fast-import".  But it does not seem to be the case.

Please rephrase the first sentence.  Is it a confusing _bug_, or the
program produces a garbage output when fed a garbage output?

> Make fast-import complain about the buggy input, for both
> from and merge lines that use marks.

Perhaps these two lines, negated to state the current behaviour e.g. "git
fast-import does not complain when a mark that is used in 'from' or
'merge' command to name a commit is not followed by the mandatory LF." can
be used to replace the first sentence, followed by "It does X" to describe
the user-observable breakage for a bonus point.

> Signed-off-by: Pete Wyckoff <pw@xxxxxxxx>
> ---
> I spent too long tracking down the bug described in the
> commit message.  It might help future users if fast-import
> were to complain in this case.

It would help future users if the commit log actually described the
symptom caused by the bug; otherwise, future users would not notice when
hitting the same issue.

> diff --git a/fast-import.c b/fast-import.c
> index a85275d..13001bb 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -2537,8 +2537,16 @@ static int parse_from(struct branch *b)
>  		hashcpy(b->branch_tree.versions[0].sha1, t);
>  		hashcpy(b->branch_tree.versions[1].sha1, t);
>  	} else if (*from == ':') {
> -		uintmax_t idnum = strtoumax(from + 1, NULL, 10);
> -		struct object_entry *oe = find_mark(idnum);
> +		char *eptr;
> +		uintmax_t idnum = strtoumax(from + 1, &eptr, 10);
> +		struct object_entry *oe;
> +		if (eptr) {
> +			for (; *eptr && isspace(*eptr); eptr++) ;

Put the empty body on a separate line, i.e.

			for ( ; isspace(*eptr); eptr++)
				; /* nothing */

> +			if (*eptr)
> +				die("Garbage after mark: %s",
> +				    command_buf.buf);

Good.

> +		}
> +		oe = find_mark(idnum);
>  		if (oe->type != OBJ_COMMIT)
>  			die("Mark :%" PRIuMAX " not a commit", idnum);

Would it help future callers if you made this small part that parses a
mark into a separate small helper function that returns an oe and
increment the pointer so that the caller can peek at the terminating
character to enforce the syntax?  E.g.

	} else if (*from == ':') {
		char *cp = from + 1;
		struct object_entry *oe = parse_mark(&cp);
		if (*cp)
			die("Garbage after mark: %s", command_buf.buf);
                if (!oe || oe->type != OBJ_COMMIT)
                	die("No such commit: %s", command_buf.buf);
	}

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html