Re:! [PATCH/RFC] git-mailinfo: use strbuf's instead of fixed buffers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Lukas Sandström <lukass@xxxxxxxxxxxxxxxx> writes:

> -static char *sanity_check(char *name, char *email)
> +static void sanity_check(struct strbuf *out, struct strbuf *name, struct strbuf *email)
>  {
> -	int len = strlen(name);
> -	if (len < 3 || len > 60)
> -		return email;
> -	if (strchr(name, '@') || strchr(name, '<') || strchr(name, '>'))
> -		return email;
> -	return name;
> +	struct strbuf o = STRBUF_INIT;
> +	if (name->len < 3 || name->len > 60)
> +		strbuf_addbuf(&o, email);
> +	if (strchr(name->buf, '@') || strchr(name->buf, '<') ||
> +		strchr(name->buf, '>'))
> +		strbuf_addbuf(&o, email);
> +	strbuf_addbuf(&o, name);
> +	strbuf_reset(out);
> +	strbuf_addbuf(out, &o);
> +	strbuf_release(&o);

This does not look like a correct conversion.  When name is too short or
too long, we do not even look at name and return email straight.  Perhaps
this would be more faithful conversion:

	struct strbuf *src = name;
	if (name->len < 3 ||
            60 < name->len ||
	    strchr(name->buf, '@') ||
	    strchr(name->buf, '<') ||
	    strchr(name->buf, '>'))
		src = email;
	else if (name == out)
        	return;
	strbuf_reset(out);
	strbuf_addbuf(out, src);

It is not your fault, but sanity_check() is a grave misnomer for this
function.  This does "get_sane_name" (i.e. we have name and email but if
name does not look right, use email instead).

> -static int bogus_from(char *line)
> +static int bogus_from(const struct strbuf *line)
>  {
>  	/* John Doe <johndoe> */
> -	char *bra, *ket, *dst, *cp;
>  
> +	char *bra, *ket;
>  	/* This is fallback, so do not bother if we already have an
>  	 * e-mail address.
>  	 */
> -	if (*email)
> +	if (email.len)
>  		return 0;
>  
> -	bra = strchr(line, '<');
> +	bra = strchr(line->buf, '<');
>  	if (!bra)
>  		return 0;
>  	ket = strchr(bra, '>');
>  	if (!ket)
>  		return 0;
>  
> -	for (dst = email, cp = bra+1; cp < ket; )
> -		*dst++ = *cp++;
> -	*dst = 0;
> -	for (cp = line; isspace(*cp); cp++)
> -		;
> -	for (bra--; isspace(*bra); bra--)
> -		*bra = 0;
> -	cp = sanity_check(cp, email);
> -	strcpy(name, cp);
> +	strbuf_reset(&email);
> +	strbuf_add(&email, bra + 1, ket - bra - 1);
> +
> +	strbuf_reset(&name);
> +	strbuf_add(&name, line->buf, bra - line->buf);
> +	strbuf_trim(&name);
> +	sanity_check(&name, &name, &email);
>  	return 1;
>  }

Conversion looks correct but its return value does not make much sense
(again, not your fault).  bogus_from() is given a bogus looking from line
(it is not about checking if it is bogus), and returns 0 if we already
have e-mail address, if the from line does not have bra-ket for grabbing
e-mail address for, but returns 1 if we managed to get name and email
pairs.  The inconsistency does not matter only because its sole caller
handle_from() returns its return value, and its caller discards it.  We
may be better off declaring this function and handle_from() as void.

> -static int handle_from(char *in_line)
> +static int handle_from(struct strbuf *from)
> ...
> +	el = strcspn(at, " \n\t\r\v\f>");
> +	strbuf_reset(&email);
> +	strbuf_add(&email, at, el);
> +	strbuf_remove(from, at - from->buf, el + 1);
>  	/* The remainder is name.  It could be "John Doe <john.doe@xz>"
>  	 * or "john.doe@xz (John Doe)", but we have whited out the
>  	 * email part, so trim from both ends, possibly removing
>  	 * the () pair at the end.
>  	 */

Now, it should read "but we have removed the email part", I think.

> +	strbuf_trim(from);
> +	if (*from->buf == '(')
> +		strbuf_remove(&name, 0, 1);
> +	if (*(from->buf + from->len - 1) == ')')

Can from be empty at this point before this check?

> +		strbuf_setlen(from, from->len - 1);
> +
> +	sanity_check(&name, from, &email);
>  	return 1;
>  }

We used to copy the data from the argument (in_line) before munging it in
this function, but now we are modifying it in place (from).  Does this
upset our caller, or the original code was just doing an extra unnecessary
copy?

> -static int handle_header(char *line, char *data, int ofs)
> +static void handle_header(struct strbuf **out, const struct strbuf *line)
>  {
> -	if (!line || !data)
> -		return 1;
> -
> -	strcpy(data, line+ofs);
> +	if (!*out) {
> +		*out = xmalloc(sizeof(struct strbuf));
> +		strbuf_init(*out, line->len);
> +	} else
> +		strbuf_reset(*out);
>  
> -	return 0;
> +	strbuf_addbuf(*out, (struct strbuf *)line); /* const warning */
>  }

I think its second parameter can safely become "const struct strbuf *";
perhaps we should fix the definition of strbuf_addbuf() in your first
patch?

> @@ -173,180 +153,176 @@ static int slurp_attr(const char *line, const char *name, char *attr)
>  	else
>  		ends = "; \t";
>  	sz = strcspn(ap, ends);
> -	memcpy(attr, ap, sz);
> -	attr[sz] = 0;
> +	strbuf_add(attr, ap, sz);
>  	return 1;
>  }
>  
>  struct content_type {
> -	char *boundary;
> -	int boundary_len;
> +	struct strbuf *boundary;
>  };
>  
>  static struct content_type content[MAX_BOUNDARIES];

Wouldn't it make more sense to get rid of "struct content_type" altogether
and use "struct strbuf *content[MAX_BOUNDARIES]" directly?

I'll review from handle_content_type() til the rest of the file
separately, as my concentration is wearing out..
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux