Christian Couder <christian.couder@xxxxxxxxx> writes: > (Sorry I just realized that I had sent this email to Linus only.) > > On Fri, Apr 26, 2024 at 2:26 AM Linus Arver via GitGitGadget > <gitgitgadget@xxxxxxxxx> wrote: >> >> From: Linus Arver <linusa@xxxxxxxxxx> >> >> Previously the iterator did not iterate over non-trailer lines. This was >> somewhat unfortunate, because trailer blocks could have non-trailer >> lines in them since 146245063e (trailer: allow non-trailers in trailer >> block, 2016-10-21), which was before the iterator was created in >> f0939a0eb1 (trailer: add interface for iterating over commit trailers, >> 2020-09-27). >> >> So if trailer API users wanted to iterate over all lines in a trailer >> block (including non-trailer lines), they could not use the iterator and >> were forced to use the lower-level trailer_info struct directly (which >> provides a raw string array that includes all lines in the trailer >> block). >> >> Change the iterator's behavior so that we also iterate over non-trailer >> lines, instead of skipping over them. The new "raw" member of the >> iterator allows API users to access previously inaccessible non-trailer >> lines. Reword the variable "trailer" to just "line" because this >> variable can now hold both trailer lines _and_ non-trailer lines. >> >> The new "raw" member is important because anyone currently not using the >> iterator is using trailer_info's raw string array directly to access >> lines to check what the combined key + value looks like. If we didn't >> provide a "raw" member here, iterator users would have to re-construct >> the unparsed line by concatenating the key and value back together again >> --- which places an undue burden for iterator users. >> >> The next commit demonstrates the use of the iterator in sequencer.c as an >> example of where "raw" will be useful, so that it can start using the >> iterator. >> >> For the existing use of the iterator in builtin/shortlog.c, we don't >> have to change the code there because that code does >> >> trailer_iterator_init(&iter, body); >> while (trailer_iterator_advance(&iter)) { >> const char *value = iter.val.buf; >> >> if (!string_list_has_string(&log->trailers, iter.key.buf)) >> continue; >> >> ... >> >> and the >> >> if (!string_list_has_string(&log->trailers, iter.key.buf)) >> >> condition already skips over non-trailer lines (iter.key.buf is empty >> for non-trailer lines, making the comparison still work even with this >> commit). >> >> Rename "num_expected_trailers" to "num_expected_objects" in >> t/unit-tests/t-trailer.c because the items we iterate over now include >> non-trailer lines. > > I think it would be simpler if the previous patch used just > "num_expected" or "expected". It's not like the other fields in the > struct ("msg" and "name") are very explicit, so why this one only? I didn't give it much thought TBH. "num_expected" SGTM. Will update. >> Signed-off-by: Linus Arver <linusa@xxxxxxxxxx> > > >> diff --git a/trailer.c b/trailer.c >> index 3e4dab9c065..4700c441442 100644 >> --- a/trailer.c >> +++ b/trailer.c >> @@ -1146,17 +1146,15 @@ void trailer_iterator_init(struct trailer_iterator *iter, const char *msg) >> >> int trailer_iterator_advance(struct trailer_iterator *iter) >> { >> - while (iter->internal.cur < iter->internal.info.trailer_nr) { >> - char *trailer = iter->internal.info.trailers[iter->internal.cur++]; >> - int separator_pos = find_separator(trailer, separators); >> - >> - if (separator_pos < 1) >> - continue; /* not a real trailer */ >> + if (iter->internal.cur < iter->internal.info.trailer_nr) { >> + char *line = iter->internal.info.trailers[iter->internal.cur++]; >> + int separator_pos = find_separator(line, separators); >> >> + iter->raw = line; >> strbuf_reset(&iter->key); >> strbuf_reset(&iter->val); >> parse_trailer(&iter->key, &iter->val, NULL, >> - trailer, separator_pos); >> + line, separator_pos); >> /* Always unfold values during iteration. */ >> unfold_value(&iter->val); >> return 1; >> diff --git a/trailer.h b/trailer.h >> index 9f42aa75994..ebafa3657e4 100644 >> --- a/trailer.h >> +++ b/trailer.h >> @@ -125,6 +125,14 @@ void format_trailers_from_commit(const struct process_trailer_options *, >> * trailer_iterator_release(&iter); >> */ >> struct trailer_iterator { >> + /* >> + * Raw line (e.g., "foo: bar baz") before being parsed as a trailer >> + * key/val pair as part of a trailer block. A trailer block can be >> + * either 100% trailer lines, or mixed in with non-trailer lines (in >> + * which case at least 25% must be trailer lines). > > I don't think 25% is important here. SG, will remove 25% language (FWIW we already have such language in trailer.c if devs want to take a more closer look, so it's not like we're losing any info overall). > What is more important is to just > say that this field could not be an actual trailer, and to tell what > the 'key' and 'val' fields below will contain then. Will update. > >> + */ >> + const char *raw; >> + >> struct strbuf key; >> struct strbuf val; BTW I will be on vacation for the next several weeks. However as the suggested changes are minor, I think I can still get to them and push up a v4 sometime this week. Cheers.