Re: [PATCH 0/18] snprintf cleanups

Jeff King <peff@xxxxxxxx> · Tue, 28 Mar 2017 23:41:05 -0400

On Tue, Mar 28, 2017 at 03:33:48PM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > It's a lot of patches, but hopefully they're all pretty straightforward
> > to read.
> 
> Yes, quite a lot of changes.  I didn't see anything questionable in
> there.
> 
> As to the "patch-id" thing, I find the alternate one slightly easier
> to read.  Also, exactly because this is not a performance critical
> codepath, it may be better if patch_id_add_string() filtered out
> whitespaces; that would allow the source to express things in more
> natural way, e.g.
> 
> 		patch_id_addf(&ctx, "new file mode");
> 		patch_id_addf(&ctx, "%06o", p->two->mode);
> 		patch_id_addf(&ctx, "--- /dev/null");
> 		patch_id_addf(&ctx, "+++ b/%.*s", len2, p->two->path);
> 
> Or I may be going overboard by bringing "addf" into the mix X-<.

I think there are two things going on in your example.

One is that obviously patch_id_addf() removes the spaces from the
result. But we could do that now by keeping the big strbuf_addf(), and
then just walking the result and feeding non-spaces.
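
As a rough illustration of that first option, here is a minimal standalone
sketch. It is not git code: toy_ctx, toy_update() and toy_addf_nospace() are
hypothetical names, the "hash" only records the bytes it is fed, and a
fixed-size buffer stands in for the strbuf.

```c
#include <assert.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for git_SHA1_CTX: it only records the bytes fed in. */
struct toy_ctx {
	char data[256];
	size_t len;
};

static void toy_update(struct toy_ctx *ctx, const char *buf, size_t len)
{
	assert(ctx->len + len < sizeof(ctx->data));
	memcpy(ctx->data + ctx->len, buf, len);
	ctx->len += len;
	ctx->data[ctx->len] = '\0';
}

/* Format into one buffer, then walk it and feed only the non-space bytes. */
static void toy_addf_nospace(struct toy_ctx *ctx, const char *fmt, ...)
{
	char buf[128]; /* assumption: fixed size; unbounded input needs a dynamic buffer */
	va_list ap;
	int n;
	size_t i;

	va_start(ap, fmt);
	n = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	assert(n >= 0 && (size_t)n < sizeof(buf)); /* refuse to truncate silently */

	for (i = 0; i < (size_t)n; i++) {
		if (!isspace((unsigned char)buf[i]))
			toy_update(ctx, &buf[i], 1);
	}
}
```

The caller can then keep a single multi-line format string with natural
spacing, and the whitespace never reaches the hash.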

The second is that your addf means we are back to formatting everything
into a buffer again. And it has to be dynamic to handle the final line
there, because "len2" isn't bounded. At which point we may as well go
back to sticking it all in one big strbuf (your example also breaks it
down line by line, but we could do that with separate strbuf_addf calls,
too).

Or you have to reimplement the printf format-parsing yourself, and write
into the sha1 instead of into the buffers. But that's probably insane.

I think the "no extra buffer, but still strip whitespace" combo is more like:

  void patch_id_add_buf(git_SHA1_CTX *ctx, const char *buf, size_t len)
  {
	for (; len > 0; buf++, len--) {
		if (!isspace(*buf))
			git_SHA1_Update(ctx, buf, 1);
	}
  }

  void patch_id_add_str(git_SHA1_CTX *ctx, const char *str)
  {
	patch_id_add_buf(ctx, str, strlen(str));
  }

  void patch_id_add_mode(git_SHA1_CTX *ctx, unsigned mode)
  {
	char buf[16]; /* big enough... */
	int len = xsnprintf(buf, sizeof(buf), "%06o", mode);
	patch_id_add_buf(ctx, buf, len);
  }

  patch_id_add_str(&ctx, "new file mode");
  patch_id_add_mode(&ctx, p->two->mode);
  patch_id_add_str(&ctx, "--- /dev/null");
  patch_id_add_str(&ctx, "+++ b/");
  patch_id_add_buf(&ctx, p->two->path, len2);

I dunno. I wondered if feeding single bytes to the sha1 update might
actually be noticeably slower, because I would assume that internally it
generally copies data in larger chunks. I didn't measure it, though.
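
If the per-byte calls did turn out to matter, one possible variant is to feed
each run of non-space bytes in a single update call. The sketch below is
standalone and unmeasured; call_log, log_update() and add_buf_in_runs() are
hypothetical names, and the recorder only stands in for the sha1 context so
that the number of update calls is visible.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical recorder standing in for the hash context. */
struct call_log {
	char data[256];
	size_t len;
	unsigned calls;
};

static void log_update(void *ctx, const char *buf, size_t len)
{
	struct call_log *log = ctx;

	assert(log->len + len < sizeof(log->data));
	memcpy(log->data + log->len, buf, len);
	log->len += len;
	log->data[log->len] = '\0';
	log->calls++;
}

typedef void (*update_fn)(void *ctx, const char *buf, size_t len);

/* Skip whitespace, but pass each contiguous non-space run in one call. */
static void add_buf_in_runs(void *ctx, update_fn update,
			    const char *buf, size_t len)
{
	size_t start = 0, i;

	for (i = 0; i < len; i++) {
		if (isspace((unsigned char)buf[i])) {
			if (i > start)
				update(ctx, buf + start, i - start);
			start = i + 1; /* next run begins after the space */
		}
	}
	if (len > start)
		update(ctx, buf + start, len - start); /* trailing run */
}
```

For a line like "--- /dev/null\n" this makes two update calls instead of
twelve, at the cost of a slightly longer loop.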

-Peff