Re: [PATCH 4/5] ref-filter: truncate atom names in error messages

Jeff King <peff@xxxxxxxx> · Wed, 14 Dec 2022 15:39:34 -0500

On Wed, Dec 14, 2022 at 03:05:05PM -0500, Taylor Blau wrote:

> On Wed, Dec 14, 2022 at 11:23:53AM -0500, Jeff King wrote:
> > It seems like the cleanest fix would be for atom->name to be _just_ the
> > name, since there's already a separate "args" field. But since that
> > field is also used for other things, we can't change it easily (e.g.,
> > it's how we find things in the used_atoms array, and clearly %(refname)
> > and %(refname:short) are not the same thing).
> >
> > Instead, we'll teach our error_bad_arg() function to stop at the first
> > ":". This is a little hacky, as we're effectively re-parsing the name,
> > but the format is simple enough to do this as a one-liner, and this
> > localizes the change to the error-reporting code.
> >
> > We'll give the same treatment to err_no_arg(). None of its callers use
> > this atom->name trick, but it's worth future-proofing it while we're
> > here.
> 
> For what it's worth, I think that this balance of a somewhat-hacky
> implementation against a more significant and trickier refactoring is
> well thought-out and the right decision, IMHO.
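
For anyone skimming the thread, the "stop at the first ':'" trick
described in the quoted text can be sketched roughly as below. This is
a standalone illustration, not the code from the patch:
atom_name_len() and format_bad_arg() are hypothetical names, the
message text is only illustrative, and plain strcspn()/snprintf() stand
in for whatever helpers ref-filter.c actually uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Length of the bare atom name: everything before the first ':',
 * or the whole string if there is no ':' at all.
 */
static size_t atom_name_len(const char *name)
{
	assert(name != NULL);
	return strcspn(name, ":");
}

/*
 * Hypothetical formatter: report only the bare name, even if the
 * caller handed us "name:args" (as atom->name does).
 */
static int format_bad_arg(char *buf, size_t size,
			  const char *name, const char *arg)
{
	return snprintf(buf, size, "unrecognized %%(%.*s) argument: %s",
			(int)atom_name_len(name), name, arg);
}
```

The "%.*s" precision does the truncation, so nothing gets copied or
modified; the re-parsing stays confined to the error path.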

By the way, I did try the other change, to make atom->name just contain
the name with no args. There are a bunch of pitfalls in
parse_ref_filter_atom(), including:

  - don't use atom_len; it's off-by-one when looking at "atom" and not
    "sp" when there's a "*" dereference

  - the "args" pointer in the struct actually points into the name
    string. I don't think anybody relies on that, but I'm not 100% sure
    because...

  - once you deal with that, then it segfaults mysteriously, because
    the numeric index between used_atom and the computed values gets out
    of sync. That's where I gave up.
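
To make the second pitfall concrete, here is a rough standalone sketch
of the two layouts. It is not the real ref-filter code: struct
atom_sketch, dupz(), parse_aliased(), parse_split() and
atom_sketch_release() are all made-up names, and dupz() only mimics
what xmemdupz() does in the diff below (copy a span and NUL-terminate
it).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of a used atom. */
struct atom_sketch {
	char *name;       /* always an owned copy */
	const char *args; /* NULL when there are no arguments */
	int args_owned;   /* non-zero if args is its own allocation */
};

/* Copy len bytes and NUL-terminate (stand-in for xmemdupz). */
static char *dupz(const char *s, size_t len)
{
	char *p = malloc(len + 1);
	if (!p)
		abort();
	memcpy(p, s, len);
	p[len] = '\0';
	return p;
}

/* Current layout: name is "atom:args" and args points into it. */
static void parse_aliased(struct atom_sketch *a,
			  const char *atom, const char *ep)
{
	const char *colon = memchr(atom, ':', (size_t)(ep - atom));

	a->name = dupz(atom, (size_t)(ep - atom));
	a->args = colon ? a->name + (colon - atom) + 1 : NULL;
	if (a->args && !*a->args)
		a->args = NULL; /* "%(atom:)" behaves like "%(atom)" */
	a->args_owned = 0;
}

/* Attempted layout: name is just the name, args is a separate copy. */
static void parse_split(struct atom_sketch *a,
			const char *atom, const char *ep)
{
	const char *colon = memchr(atom, ':', (size_t)(ep - atom));

	a->name = dupz(atom, (size_t)((colon ? colon : ep) - atom));
	a->args = NULL;
	a->args_owned = 0;
	if (colon && colon + 1 < ep) {
		a->args = dupz(colon + 1, (size_t)(ep - (colon + 1)));
		a->args_owned = 1;
	}
}

/* The split layout has a second allocation to free. */
static void atom_sketch_release(struct atom_sketch *a)
{
	if (a->args_owned)
		free((char *)a->args);
	free(a->name);
}
```

With the aliased layout, anything that looks an atom up by name sees
the arguments too, which is why %(refname) and %(refname:short) stay
distinct; the split layout would need those lookups to compare args as
well.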

Which isn't to say it isn't doable, or that the refactoring wouldn't
make the ref-filter code cleaner overall. But it seemed like too much
for solving this one little problem.

Here's the patch where I stopped, for posterity:

diff --git a/ref-filter.c b/ref-filter.c
index caf10ab23e..3a2a7d0271 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -716,7 +716,7 @@ static int parse_ref_filter_atom(struct ref_format *format,
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
 	used_atom[at].atom_type = i;
-	used_atom[at].name = xmemdupz(atom, ep - atom);
+	used_atom[at].name = xmemdupz(atom, (arg ? arg : ep) - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
 	used_atom[at].source = valid_atom[i].source;
 	if (used_atom[at].source == SOURCE_OBJ) {
@@ -726,8 +726,10 @@ static int parse_ref_filter_atom(struct ref_format *format,
 			oi.info.contentp = &oi.content;
 	}
 	if (arg) {
-		arg = used_atom[at].name + (arg - atom) + 1;
-		if (!*arg) {
+		arg++; /* skip ':' */
+		if (arg < ep) {
+			arg = xmemdupz(arg, ep - arg);
+		} else {
 			/*
 			 * Treat empty sub-arguments list as NULL (i.e.,
 			 * "%(atom:)" is equivalent to "%(atom)").

Its relative shortness does not represent the great confusion I had in
producing it.

-Peff