Re: [PATCH 03/14] copy_fd: pass error message back through a strbuf

Jeff King <peff@xxxxxxxx> · Wed, 3 Dec 2014 16:00:31 -0500

On Wed, Dec 03, 2014 at 11:01:08AM -0800, Junio C Hamano wrote:

> Jonathan Nieder <jrnieder@xxxxxxxxx> writes:
> 
> > -extern int copy_fd(int ifd, int ofd);
> > +extern int copy_fd(int ifd, int ofd, struct strbuf *err);
> 
> It is not limited to this single function, but what contract do we
> envision this "error messages are given back to the caller via
> strbuf" convention should give between the callers and the callee?
> 
> For example, is it a bug in the callee to touch "err" when there is
> no error to report?  Another example is if we should allow the
> callers to pass NULL there when they do not care about the nature of
> the error (e.g. "git cmd -q").
> 
> There may be other rules we want to enforce consistently across
> functions that adopt this convention.

It seems like we are really re-designing the error-handling chain here.
Maybe it is worth taking a step back and thinking about our overall
strategy.

Why do we dislike errno? Because it is global state, and it is easy for
it to get stomped by unrelated operations.  Another reason is that it
carries no parameters. You see "ENOENT", but you do not have any context
(e.g., which file).

What's good about errno? It is machine-readable (i.e., callers can check
for ENOENT, as opposed to text in a strbuf). It does not require any
allocation. Besides making it slightly more robust, it removes any
responsibility for resource cleanup from the caller.  It's globalness is
also convenient; you do not need to add an extra parameter to each
function to handle errors.

So what are some alternatives?

Passing back "-errno" instead of "-1" helps with the stomping issue (and
without adding extra parameters). It also retains machine-readability
(which I'll call just readability from now on).  But it doesn't help
with context, or better messaging.

Your solution adds a strbuf. That helps with context and stomping, but
loses readability and adds allocation.

If we changed the strbuf to a fixed-size buffer, that would help the
allocation issue. Some messages might be truncated, but it seems
unlikely in practice. It still loses readability, though.

What about a struct that has an errno-like value _and_ a fixed-size
buffer? I'm thinking something like:

  struct error {
	int code;
	char msg[1024];
  };

  /* caller who wants to print the message; no need to free message */
  struct error err;
  if (copy_fd(from, to, &err))
	return error("%s", err.msg);

  /* caller who does not; they can pass NULL */
  if (copy_fd(from, to, NULL))
	return -1;

  /* or they can use it to grab the errno value */
  struct error err;
  if (copy_fd(from, to, &err)) {
	if (err.code == EPIPE)
		exit(141);
	return -1;
  }

  /* function which generates error */
  void read_foo(const char *fn, struct error *err)
  {
	if (open(fn, O_RDONLY))
		return mkerror(err, errno, "unable to open %s", fn);
	... do other stuff ...
	return 0;
  }

  /* helper function for generating errors */
  int mkerror(struct error *err, int code, const char *fmt, ...)
  {
	va_list ap;
	int len;

	if (!err)
		return;

	err->code = code;
	va_start(ap, fmt);
	len = vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
	va_end(ap);

	if (code)
		snprintf(err->msg + len, sizeof(err->msg) - len,
			 ": %s", strerror(code));

	return -1;
  }

You can also take the machine-readable thing a step further, like:

  struct error {
	int code;
	char param1[1024];
	char param2[1024];
	/* 2 parameters should be big enough for anyone, right? */
  }

and then generate the message on the fly when printing. This gives the
callers more information. But it also means defining a constant for
"code" for every error message, which is annoying. Libraries often do
this, but I do not think we need to go that far here.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html