Re: [PATCH v4 03/10] trailer: teach iterator about non-trailer lines

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Linus

Sorry I'm late to the party here I've left a couple of thoughts below but I don't want to derail this series if everyone else is happy.

On 02/05/2024 05:54, Linus Arver via GitGitGadget wrote:
From: Linus Arver <linus@xxxxxxxx>

Previously the iterator did not iterate over non-trailer lines. This was
somewhat unfortunate, because trailer blocks could have non-trailer
lines in them since 146245063e (trailer: allow non-trailers in trailer
block, 2016-10-21), which was before the iterator was created in
f0939a0eb1 (trailer: add interface for iterating over commit trailers,
2020-09-27).

So if trailer API users wanted to iterate over all lines in a trailer
block (including non-trailer lines), they could not use the iterator and
were forced to use the lower-level trailer_info struct directly (which
provides a raw string array that includes all lines in the trailer
block).

Change the iterator's behavior so that we also iterate over non-trailer
lines, instead of skipping over them. The new "raw" member of the
iterator allows API users to access previously inaccessible non-trailer
lines. Reword the variable "trailer" to just "line" because this
variable can now hold both trailer lines _and_ non-trailer lines.

The new "raw" member is important because anyone currently not using the
iterator is using trailer_info's raw string array directly to access
lines to check what the combined key + value looks like. If we didn't
provide a "raw" member here, iterator users would have to re-construct
the unparsed line by concatenating the key and value back together again
--- which places an undue burden for iterator users.

Comparing the raw line is error prone as it ignores custom separators and variations in the amount of space between the key and the value. Therefore I'd argue that the sequencer should in fact be comparing the trailer key and value separately rather than comparing the whole line. There is an issue that we want to add a new Signed-off-by: trailer for "C.O. Mitter" when the trailers look like

	Signed-off-by: C.O. Mitter <c.o.mitter@xxxxxxxxxxx>
	non-trailer-line

but not when they look like

	Signed-off-by: C.O. Mitter <c.o.mitter@xxxxxxxxxxx>

so we still need some way of indicating that there was a non-trailer line after the last trailer though.

The next commit demonstrates the use of the iterator in sequencer.c as an
example of where "raw" will be useful, so that it can start using the
iterator.

For the existing use of the iterator in builtin/shortlog.c, we don't
have to change the code there because that code does

An interface that lets the caller pass a flag if they want to know about non-trailer lines might be easier to use for the callers that don't want to worry about such lines and wouldn't need a justification as to why it was safe for existing callers.

Best Wishes

Phillip

     trailer_iterator_init(&iter, body);
     while (trailer_iterator_advance(&iter)) {
         const char *value = iter.val.buf;

         if (!string_list_has_string(&log->trailers, iter.key.buf))
             continue;

         ...

and the

         if (!string_list_has_string(&log->trailers, iter.key.buf))

condition already skips over non-trailer lines (iter.key.buf is empty
for non-trailer lines, making the comparison still work even with this
commit).

Rename "num_expected_trailers" to "num_expected" in
t/unit-tests/t-trailer.c because the items we iterate over now include
non-trailer lines.

Signed-off-by: Linus Arver <linus@xxxxxxxx>
---
  t/unit-tests/t-trailer.c | 16 +++++++++++-----
  trailer.c                | 12 +++++-------
  trailer.h                |  7 +++++++
  3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/t/unit-tests/t-trailer.c b/t/unit-tests/t-trailer.c
index c1f897235c7..4f640d2a4b8 100644
--- a/t/unit-tests/t-trailer.c
+++ b/t/unit-tests/t-trailer.c
@@ -1,7 +1,7 @@
  #include "test-lib.h"
  #include "trailer.h"
-static void t_trailer_iterator(const char *msg, size_t num_expected_trailers)
+static void t_trailer_iterator(const char *msg, size_t num_expected)
  {
  	struct trailer_iterator iter;
  	size_t i = 0;
@@ -11,7 +11,7 @@ static void t_trailer_iterator(const char *msg, size_t num_expected_trailers)
  		i++;
  	trailer_iterator_release(&iter);
- check_uint(i, ==, num_expected_trailers);
+	check_uint(i, ==, num_expected);
  }
static void run_t_trailer_iterator(void)
@@ -19,7 +19,7 @@ static void run_t_trailer_iterator(void)
  	static struct test_cases {
  		const char *name;
  		const char *msg;
-		size_t num_expected_trailers;
+		size_t num_expected;
  	} tc[] = {
  		{
  			"empty input",
@@ -119,7 +119,13 @@ static void run_t_trailer_iterator(void)
  			"not a trailer line\n"
  			"not a trailer line\n"
  			"Signed-off-by: x\n",
-			1
+			/*
+			 * Even though there is only really 1 real "trailer"
+			 * (Signed-off-by), we still have 4 trailer objects
+			 * because we still want to iterate through the entire
+			 * block.
+			 */
+			4
  		},
  		{
  			"with non-trailer lines (one too many) in trailer block",
@@ -162,7 +168,7 @@ static void run_t_trailer_iterator(void)
for (int i = 0; i < sizeof(tc) / sizeof(tc[0]); i++) {
  		TEST(t_trailer_iterator(tc[i].msg,
-					tc[i].num_expected_trailers),
+					tc[i].num_expected),
  		     "%s", tc[i].name);
  	}
  }
diff --git a/trailer.c b/trailer.c
index 3e4dab9c065..4700c441442 100644
--- a/trailer.c
+++ b/trailer.c
@@ -1146,17 +1146,15 @@ void trailer_iterator_init(struct trailer_iterator *iter, const char *msg)
int trailer_iterator_advance(struct trailer_iterator *iter)
  {
-	while (iter->internal.cur < iter->internal.info.trailer_nr) {
-		char *trailer = iter->internal.info.trailers[iter->internal.cur++];
-		int separator_pos = find_separator(trailer, separators);
-
-		if (separator_pos < 1)
-			continue; /* not a real trailer */
+	if (iter->internal.cur < iter->internal.info.trailer_nr) {
+		char *line = iter->internal.info.trailers[iter->internal.cur++];
+		int separator_pos = find_separator(line, separators);
+ iter->raw = line;
  		strbuf_reset(&iter->key);
  		strbuf_reset(&iter->val);
  		parse_trailer(&iter->key, &iter->val, NULL,
-			      trailer, separator_pos);
+			      line, separator_pos);
  		/* Always unfold values during iteration. */
  		unfold_value(&iter->val);
  		return 1;
diff --git a/trailer.h b/trailer.h
index 9f42aa75994..7e36da7d13c 100644
--- a/trailer.h
+++ b/trailer.h
@@ -125,6 +125,13 @@ void format_trailers_from_commit(const struct process_trailer_options *,
   *   trailer_iterator_release(&iter);
   */
  struct trailer_iterator {
+	/*
+	 * Raw line (e.g., "foo: bar baz") before being parsed as a trailer
+	 * key/val pair as part of a trailer block (as the "key" and "val"
+	 * fields below). If a line fails to parse as a trailer, then the "key"
+	 * will be the entire line and "val" will be the empty string.
+	 */
+	const char *raw;
  	struct strbuf key;
  	struct strbuf val;




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux