Re: [PATCH] t6300: values containing ')' are broken in ref formats

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 05, 2024 at 07:05:13PM -0800, Junio C Hamano wrote:
> Jeff King <peff@xxxxxxxx> writes:
> 
> > I am tempted to say the solution is to expand that "equals" value, and
> > possibly add some less-arcane version of the character (maybe "%)"?).
> > But it be a break in backwards compatibility if somebody is trying to
> > match literal %-chars in their "if" block.
> 
> If they were trying to write a literal %, wouldn't they be writing
> %% already, not because % followed by a byte without any special
> meaning happens to be passed intact by the implementation, but
> because that is _the_ right thing to do, when % is used as an
> introducer for escape sequences?  So I do agree it would be a change
> that breaks backward compatibility but I do not think we want to
> stay bug to bug compatible with the current behaviour here.  I am
> not sure with the wisdom of %) though.  Wouldn't "%(foo %)" look as
> if %( opens and %) closes a group in our language?
> 
> So I am very much in favor of this "if condition should be expanded
> before comparison" solution.

I had worked on this "if condition should be expanded before comparison"
solution but Christian and I agreed that it would be better to open up
discussion incase there would be other possible and better solutions
which would also be viable in the long term.

This solution relies on how we parse out literals in ref-filter, so '%%'
would work as intended in the comparision string - ie if we want to
compare against "some%ref", we would do "%(if:equals=some%%ref)...".

Also it is obscure that someone will use this in practice as I myself
had discovered this when working on a corner case of some other
implementation related to parsing but way lower in the call-chain of
ref-filter ;) but here goes

(this applies on top of the current patch)

------------------------ >8 ------------------------
Subject: [PATCH] ref-filter: parse parentheses correctly in %(if) atoms

Having a ')' in "<string>" in ":equals=<string>" or
":notequals=<string>" wouldn't parse correctly in ref-filter as
documented in the previous commit.

One way to fix this is refactoring the way in which we parse our format
string.  Although this would mean we would have to do a huge refactoring
as this step happens very high up in the call chain.

Therefore, support including parenthesis characters in "<string>" by
instead giving their hexcode equivalents - as a for-now hack.

Do this by further abstracting "append_literal()" to "parse_literal()"
where the output is no longer stored into a ref formatting stack's
strbuf but a standard standalone strbuf.  append_literal would then
hence wrap appropriately around "parse_literal()".

Also introduce "convert_hexcode()" which also wraps around
"parse_literal()" and must be used to convert hexcode in a given string
and be silent when such a string doesn't contain hexcode or doesn't
exist (ie is NULL).

Using "convert_hexcode()" would mean that we now have an alloced string -
hence free() it once we are done with it to prevent any memory leaks.

Signed-off-by: Kousik Sanagavarapu <five231003@xxxxxxxxx>
---
 Documentation/git-for-each-ref.txt |  4 ++
 ref-filter.c                       | 76 +++++++++++++++++++++++-------
 t/t6300-for-each-ref.sh            |  6 ++-
 3 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index d3764401a2..ce12400040 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -221,6 +221,10 @@ if::
 	the value between the %(if:...) and %(then) atoms with the
 	given string.
 
+	Additionally, if `<string>` must contain parenthesis, then these
+	parentheses are spelled out as hexcode.  For e.g., `1)someref`
+	would need to be `1%29someref`.
+
 symref::
 	The ref which the given symbolic ref refers to. If not a
 	symbolic ref, nothing is printed. Respects the `:short`,
diff --git a/ref-filter.c b/ref-filter.c
index 84c6036107..ebdb2daeb7 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1234,13 +1234,64 @@ static void if_then_else_handler(struct ref_formatting_stack **stack)
 	*stack = cur;
 }
 
+/*
+ * Parse out literals in the string pointed to by "cp" and store them in
+ * a strbuf - this would go on until we hit NUL or "ep".
+ *
+ * While at it, if they're of the form "%xx", where xx represents the
+ * hexcode of some character, then convert them into their equivalent
+ * characters.
+ */
+static void parse_literal(const char *cp, const char *ep,
+			  struct strbuf *s)
+{
+	while (*cp && (!ep || cp < ep)) {
+		if (*cp == '%') {
+			if (cp[1] == '%')
+				cp++;
+			else {
+				int ch = hex2chr(cp + 1);
+				if (0 <= ch) {
+					strbuf_addch(s, ch);
+					cp += 3;
+					continue;
+				}
+			}
+		}
+		strbuf_addch(s, *cp);
+		cp++;
+	}
+}
+
+/*
+ * Convert the string, pointed to by "cp", which might or might not
+ * contain hexcode (in the format of "%xx" where xx is the hexcode) to
+ * its character-equivalent string and return it.
+ *
+ * If the string does not contain any hexcode - then it is returned as
+ * is.
+ */
+static const char *convert_hexcode(const char *cp)
+{
+	struct strbuf s = STRBUF_INIT;
+
+	if (!cp)
+		return NULL;
+	/*
+	 * This has the effect of an in-place translation but
+	 * implementation-wise it is not.
+	 */
+	parse_literal(cp, NULL, &s);
+	return strbuf_detach(&s, NULL);
+}
+
 static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
 			   struct strbuf *err UNUSED)
 {
 	struct ref_formatting_stack *new_stack;
 	struct if_then_else *if_then_else = xcalloc(1, sizeof(*if_then_else));
 
-	if_then_else->str = atomv->atom->u.if_then_else.str;
+	if_then_else->str = convert_hexcode(atomv->atom->u.if_then_else.str);
 	if_then_else->cmp_status = atomv->atom->u.if_then_else.cmp_status;
 
 	push_stack_element(&state->stack);
@@ -1296,6 +1347,9 @@ static int then_atom_handler(struct atom_value *atomv UNUSED,
 			if_then_else->condition_satisfied = 1;
 	} else if (cur->output.len && !is_empty(&cur->output))
 		if_then_else->condition_satisfied = 1;
+
+	if (if_then_else->str)
+		free((char *)if_then_else->str);
 	strbuf_reset(&cur->output);
 	return 0;
 }
@@ -3425,26 +3479,12 @@ void ref_array_sort(struct ref_sorting *sorting, struct ref_array *array)
 		QSORT_S(array->items, array->nr, compare_refs, sorting);
 }
 
-static void append_literal(const char *cp, const char *ep, struct ref_formatting_state *state)
+static void append_literal(const char *cp, const char *ep,
+			   struct ref_formatting_state *state)
 {
 	struct strbuf *s = &state->stack->output;
 
-	while (*cp && (!ep || cp < ep)) {
-		if (*cp == '%') {
-			if (cp[1] == '%')
-				cp++;
-			else {
-				int ch = hex2chr(cp + 1);
-				if (0 <= ch) {
-					strbuf_addch(s, ch);
-					cp += 3;
-					continue;
-				}
-			}
-		}
-		strbuf_addch(s, *cp);
-		cp++;
-	}
+	parse_literal(cp, ep, s);
 }
 
 int format_ref_array_item(struct ref_array_item *info,
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index ce5c607193..c9383d23a4 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -2141,7 +2141,7 @@ test_expect_success GPG 'show lack of signature with custom format' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'format having values containing ) parse correctly' '
+test_expect_success 'format having values containing ) parse correctly' '
 	git branch "1)feat" &&
 	cat >expect <<-\EOF &&
 	refs/heads/1)feat
@@ -2151,7 +2151,9 @@ test_expect_failure 'format having values containing ) parse correctly' '
 	not equals
 	not equals
 	EOF
-	git for-each-ref --format="%(if:equals=1)feat)%(refname:short)%(then)%(refname)%(else)not equals%(end)" \
+
+	# 29 is the hexcode of )
+	git for-each-ref --format="%(if:equals=1%29feat)%(refname:short)%(then)%(refname)%(else)not equals%(end)" \
 		refs/heads/ >actual &&
 	test_cmp expect actual
 '
-- 
2.47.0.230.g0cf584699a





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux