[PATCH] strbuf: always return a non-NULL value from strbuf_detach

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 18, 2012 at 03:49:21AM -0400, Jeff King wrote:

> This last line seems like it is caused by a bug in the strbuf API.
> Detaching an empty string will sometimes get you NULL and sometimes not.
> For example, this:
> 
>   struct strbuf foo = STRBUF_INIT;
>   strbuf_detach(&foo, NULL);
> 
> will return NULL. But this:
> 
>   struct strbuf foo = STRBUF_INIT;
>   strbuf_addstr(&foo, "bar");
>   strbuf_reset(&foo);
>   strbuf_detach(&foo, NULL);
> 
> will get you a zero-length string. Which just seems insane to me. The
> whole point of strbuf_detach is that you do not have to care about the
> internal representation. It should probably always return a newly
> allocated zero-length string.
> [...]
> It's possible that switching it would create bugs elsewhere (there are
> over 100 uses of strbuf_detach, so maybe somebody really does want this
> NULL behavior), but I tend to think it is just as likely to be fixing
> undiscovered bugs.

So I just read through all 108 grep hits for strbuf_detach. In almost
every case, the NULL return is not triggerable, because we do _some_
strbuf operation. And even if it is empty, like strbuf_read from an
empty source, or strbuf_addstr on an empty string, we still end up
calling `strbuf_grow(sb, 0)`, which will allocate.

There are a few cases where there are code paths where we really might
not add anything to the strbuf, and changing strbuf_detach would have an
impact:

  In log.c:cmd_format_patch, creating the rev.extra_headers string from
  the individual headers currently ends up NULL when you have no such
  headers. But it ends up not mattering if we have NULL or an empty
  string, since all code paths just end up appending it to our headers
  anyway, and the empty string is a noop.

  In commit.c:read_commit_extra_header_lines, a commit without a value
  will get a NULL value instead of the empty string. But we end up not
  dereferencing the NULL, because it just gets fed to add_extra_header
  later, which will only look at the value if its length parameter is
  non-zero. So it is built to expect the current behavior, but would be
  fine with a switch.

  In http-push.c:xml_entities, we will return NULL if fed an empty
  string. You can dereference NULL by doing "git http-push ''", although
  on glibc systems it will not segfault, because we feed the NULL to
  printf, which converts it to "(null)".

  In http-backend.c:get_parameters, we call url_decode_parameter_* to
  look at the contents of $QUERY_STRING.  These functions can return
  NULL via strbuf_detach if fed an empty string. We are ready to handle
  a NULL value like "empty=&other=value". But not an empty name, like
  "one=1&&two=2" (note the bogus double-& which yields an empty
  parameter). You can get a segfault by sending that to a smart-http
  server.

Probably more detail than you wanted, but I'm pretty confident now that
we should switch it, and that it won't cause any regressions. Patch is
below.

-- >8 --
Subject: strbuf: always return a non-NULL value from strbuf_detach

The current behavior is to return NULL when strbuf did not
actually allocate a string. This can be quite surprising to
callers, though, who may feed the strbuf from arbitrary data
and expect to always get a valid value.

In most cases, it does not make a difference because calling
any strbuf function will cause an allocation (even if the
function ends up not inserting any data). But if the code is
structured like:

  struct strbuf buf = STRBUF_INIT;
  if (some_condition)
	  strbuf_addstr(&buf, some_string);
  return strbuf_detach(&buf, NULL);

then you may or may not return NULL, depending on the
condition. This can cause us to segfault in http-push
(when fed an empty URL) and in http-backend (when an empty
parameter like "foo=bar&&" is in the $QUERY_STRING).

This patch forces strbuf_detach to allocate an empty
NUL-terminated string when it is called on a strbuf that has
not been allocated.

I investigated all call-sites of strbuf_detach. The majority
are either not affected by the change (because they call a
strbuf_* function unconditionally), or can handle the empty
string just as easily as NULL.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 strbuf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index 0510f76..4b9e30c 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -44,7 +44,9 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
 
 char *strbuf_detach(struct strbuf *sb, size_t *sz)
 {
-	char *res = sb->alloc ? sb->buf : NULL;
+	char *res;
+	strbuf_grow(sb, 0);
+	res = sb->buf;
 	if (sz)
 		*sz = sb->len;
 	strbuf_init(sb, 0);
-- 
1.8.0.rc3.3.gba630e1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]