On 05.01.2022 12:40, Junio C Hamano wrote:
Fabian Stelzer <fs@xxxxxxxxxxxx> writes:
How about something like this:
int string_find_line(char **line, size_t *len) {
const char *eol = NULL;
if (*len > 0) {
*line = *line + *len;
if (**line && **line == '\r')
(*line)++;
if (**line && **line == '\n')
(*line)++;
}
if (!**line)
return 0;
eol = strchrnul(*line, '\n');
/* Trim trailing CR from length */
if (eol > *line && eol[-1] == '\r')
eol--;
*len = eol - *line;
return 1;
}
It is a confusing piece of "we handle one line at a time" helper.
It is not obvious what the loop invariants are.
It would be most natural to readers if *line points at the very
beginning of the buffer, i.e. the beginning of the first line,
and *len points at the very first character of that line, i.e. 0.
But then the first thing this function worries about is a case where
*len is not 0. I obviously am biased, but sorry, I find what I gave
you 100 times simpler to understand.
There are a few more places where the same thing happens and text is just
split by LF, ignoring CR. The gpg parsing where this code originated being
the most prominent example. However those just parse some parts from the
output and the worst that seems to happen is a trailing CR in some log
outputs.
If we are ok with this then your version is indeed the better one. If we
want to correct the parsing at the other sites then I think a more
generalized function would be better. Since the gpg stuff is in place for a
long time and no one complained we can probably leave it as is. I'll prepare
a new patch.
Thanks