Hi, On Fri, 16 May 2008, Brian Foster wrote: > Johannes Schindelin suggested: > > The function fgets() has a big problem with NUL characters: it reads > > them, but nobody will know if the NUL comes from the file stream, or > > was appended at the end of the line. > > > > So implement a custom read_line() function. > ^^^^^^^^^^^ > read_line_with_nul() > meaning read part or all of one line which may contain NULs. Right. > >[ ... ] > > diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c > > index 46b27cd..021dc16 100644 > > --- a/builtin-mailsplit.c > > +++ b/builtin-mailsplit.c > > @@ -45,6 +45,25 @@ static int is_from_line(const char *line, int len) > > /* Could be as small as 64, enough to hold a Unix "From " line. */ > > static char buf[4096]; > > > > +/* We cannot use fgets() because our lines can contain NULs */ > > +int read_line_with_nul(char *buf, int size, FILE *in) > > +{ > > + int len = 0, c; > > + > > + for (;;) { > > + c = fgetc(in); > > + buf[len++] = c; > > + if (c == EOF || c == '\n' || len + 1 >= size) > > + break; > > + } > > + > > + if (c == EOF) > > + len--; > > + buf[len] = '\0'; > > + > > + return len; > > when fgetc(3) — why not use getc(3)? - Because mailsplit can read from a file, too. > returns EOF it is pointlessly stored in buf[] (as a 'char'!), len's > advanced, and then the storage and advancing are undone. isn't that a > bit silly? I left it at that, because it is a rare case, the buffer has to be accessed with the trailing NUL anyway, and I think it is worth to have this function quite readable. I, for one, am pretty certain that I understand what this function does, and how, in 6 months from now, without any additional documentation. > untested: > > assert(2 <= size); > do { > if ((c = getc(in)) == EOF) > break; > } while (((buf[len++] = c) != '\n' && len+1 < size); > buf[len] = '\0' > > return len; ... except this is unreadable at best ;-) > I'd tend to write this in terms of pointers, > something along the lines (untested): > > char *p, *endp; > > assert(1 <= size); > p = buf; > endp = p + (size-1); > while (p < endp) { > if ((c = getc(in)) == EOF || (*p++ = c) == '\n') > break; > } > *p = '\0'; > > return p - buf; Again, I think this is too cuddled. You have to think about every second line, and that makes for stupid mistakes with this developer. Ciao, Dscho