Re: [PATCH 10/26] Check for EOF while parsing mails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Hannes,

On Thu, 27 Apr 2017, Johannes Sixt wrote:

> Am 26.04.2017 um 22:20 schrieb Johannes Schindelin:
>
> > diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
> > index 30681681c13..c0d88f97512 100644
> > --- a/builtin/mailsplit.c
> > +++ b/builtin/mailsplit.c
> > @@ -232,7 +232,7 @@ static int split_mbox(const char *file, const char *dir,
> > int allow_bare,
> >
> >   do {
> > 		peek = fgetc(f);
> > -	} while (isspace(peek));
> > +	} while (peek >= 0 && isspace(peek));
> >   ungetc(peek, f);
> >
> > 	if (strbuf_getwholeline(&buf, f, '\n')) {
> > diff --git a/mailinfo.c b/mailinfo.c
> > index 68037758f2f..60dcad7b714 100644
> > --- a/mailinfo.c
> > +++ b/mailinfo.c
> > @@ -1099,7 +1099,7 @@ int mailinfo(struct mailinfo *mi, const char *msg,
> > const char *patch)
> >
> >   do {
> > 		peek = fgetc(mi->input);
> > -	} while (isspace(peek));
> > +	} while (peek >= 0 && isspace(peek));
> >   ungetc(peek, mi->input);
> >
> >   /* process the email header */
> >
> 
> Why? isspace(EOF) is well-defined.

So let's look at the man page on Linux:

	These functions check whether c,  which  must  have  the  value  of  an
	unsigned char or EOF, [...]

That is the only mention of it. I find it highly unobvious whether EOF
should be treated as a space or not. So let's look at the MSDN page
(https://msdn.microsoft.com/en-us/library/y13z34da.aspx) whether they talk
more about EOF:

	The behavior of isspace and _isspace_l is undefined if c is not
	EOF or in the range 0 through 0xFF, inclusive.

That's it. So I kind of *guess* that EOF is treated as not being a
whitespace character (why does this make me think of politics now? Focus,
Johannes, focus...). But the mathematician in me protests: why would we
be able to decide the character class of a character that does not exist?

Technically, you are correct, of course. The specs of fgetc() specify
quite clearly that either an unsigned char cast to an int is returned, or
EOF on end-of-file *or error*. And a quick test verifies that isspace(EOF)
returns 0.

But then, I guess I misunderstood what Coverity complained about: maybe
the problem was not so much the isspace() call but that EOF is not being
handled correctly. We pass it, unchecked, to ungetc().

It appears that I (or Coverity, if you will), missed another instance
where we simply passed EOF unchecked to ungetc().

The next iteration will have it completely reworked: I no longer guard the
isspace() behind an `!= EOF` check, but rather handle an early EOF as I
think it should be handled. Extra eyes very welcome (this is the fixup!
patch):

-- snip --
diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index c0d88f97512..9b3efc8e987 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -232,8 +232,9 @@ static int split_mbox(const char *file, const char *dir, int allow_bare,
 
 	do {
 		peek = fgetc(f);
-	} while (peek >= 0 && isspace(peek));
-	ungetc(peek, f);
+	} while (isspace(peek));
+	if (peek != EOF)
+		ungetc(peek, f);
 
 	if (strbuf_getwholeline(&buf, f, '\n')) {
 		/* empty stdin is OK */
diff --git a/mailinfo.c b/mailinfo.c
index 60dcad7b714..a319911b510 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -882,7 +882,10 @@ static int read_one_header_line(struct strbuf *line, FILE *in)
 	for (;;) {
 		int peek;
 
-		peek = fgetc(in); ungetc(peek, in);
+		peek = fgetc(in);
+		if (peek == EOF)
+			break;
+		ungetc(peek, in);
 		if (peek != ' ' && peek != '\t')
 			break;
 		if (strbuf_getline_lf(&continuation, in))
@@ -1094,14 +1097,18 @@ int mailinfo(struct mailinfo *mi, const char *msg, const char *patch)
 		return -1;
 	}
 
-	mi->p_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*(mi->p_hdr_data)));
-	mi->s_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*(mi->s_hdr_data)));
-
 	do {
 		peek = fgetc(mi->input);
-	} while (peek >= 0 && isspace(peek));
+		if (peek == EOF) {
+			fclose(cmitmsg);
+			return error("empty patch: '%s'", patch);
+		}
+	} while (isspace(peek));
 	ungetc(peek, mi->input);
 
+	mi->p_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*(mi->p_hdr_data)));
+	mi->s_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*(mi->s_hdr_data)));
+
 	/* process the email header */
 	while (read_one_header_line(&line, mi->input))
 		check_header(mi, &line, mi->p_hdr_data, 1);
-- snap --

In the first hunk, I simply rely on the code after ungetc() to figure out
that there are no headers and to handle that case as before.

The second hunk handles the case where looking for a continuation line in
the header section hits EOF; it is still a valid header, but we should
avoid ungetc(EOF) to allow the next read to report EOF correctly.

The third hunk moves the malloc()s around so that we can complain about an
empty patch, close the file and return an error value without leaking
memory.

Ciao,
Dscho



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]