Re: [PATCH v2 3/8] refs: make errno output explicit for read_raw_ref_fn

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Thu, 01 Jul 2021 13:34:15 +0200

On Thu, Jun 10 2021, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <hanwen@xxxxxxxxxx>
>
> read_raw_ref_fn needs to supply a credible errno for a number of cases. These
> are primarily:
>
> 1) The files backend calls read_raw_ref from lock_raw_ref, and uses the
> resulting error codes to create/remove directories as needed.
>
> 2) ENOENT should be translated in a zero OID, optionally with REF_ISBROKEN set,
> returning the last successfully resolved symref. This is necessary so
> read_raw_ref("HEAD") on an empty repo returns refs/heads/main (or the default branch
> du-jour), and we know on which branch to create the first commit.
>
> Make this information flow explicit by adding a failure_errno to the signature
> of read_raw_ref. All errnos from the files backend are still propagated
> unchanged, even though inspection suggests only ENOTDIR, EISDIR and ENOENT are
> relevant.
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@xxxxxxxxxx>
> Reviewed-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
> ---
>  refs.c                |  8 ++++++--
>  refs/debug.c          |  4 ++--
>  refs/files-backend.c  | 24 ++++++++++++------------
>  refs/packed-backend.c |  8 ++++----
>  refs/refs-internal.h  | 17 +++++++++--------
>  5 files changed, 33 insertions(+), 28 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 8c9490235ea6..bebe3f584da7 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1675,13 +1675,17 @@ int refs_read_raw_ref(struct ref_store *ref_store,
>  		      const char *refname, struct object_id *oid,
>  		      struct strbuf *referent, unsigned int *type)
>  {
> +	int result, failure;

Style nit: Different variables should be on different lines, except
where they're the same, so "int i, j", not "int ret, i". Perhaps "ret"
and "errno" are similar enough, but I'd split them, just my 0.02.

Also nit: s/failure/failure_errno/, just like the function signature.

>  	if (!strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD")) {
>  		return refs_read_special_head(ref_store, refname, oid, referent,
>  					      type);
>  	}
>  
> -	return ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
> -					   type);
> +	failure = 0;
> +	result = ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
> +					     type, &failure);
> +	errno = failure;
> +	return result;
>  }

Is there some subtlety here where this isn't equivalent to the more
simple/straightforward:

	diff --git a/refs.c b/refs.c
	index bebe3f584da..49ab7555de9 100644
	--- a/refs.c
	+++ b/refs.c
	@@ -1675,17 +1675,14 @@ int refs_read_raw_ref(struct ref_store *ref_store,
	 		      const char *refname, struct object_id *oid,
	 		      struct strbuf *referent, unsigned int *type)
	 {
	-	int result, failure;
	 	if (!strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD")) {
	 		return refs_read_special_head(ref_store, refname, oid, referent,
	 					      type);
	 	}

	-	failure = 0;
	-	result = ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
	-					     type, &failure);
	-	errno = failure;
	-	return result;
	+	errno = 0;
	+	return ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
	+					     type, &errno);
	 }

	 /* This function needs to return a meaningful errno on failure */

In that case adding the "failure_errno" to the signature won't be needed
either, but I'm assuming we're heading towards some reftable end-state
where that makes more sense.

I can only imagine a case where we think files_read_raw_ref() would
encounter a new errno after it assigned to *failure_errno, which is just
a couple of strbuf_release() calls.

if that is a case we're worried about then like in my comment on 2/8
shouldn't we be explicitly checking for such a lost/different errno?
I.e. something like (should probably be less fatal than BUG(...):

	diff --git a/refs.c b/refs.c
	index bebe3f584da..9584ddae392 100644
	--- a/refs.c
	+++ b/refs.c
	@@ -1675,16 +1675,22 @@ int refs_read_raw_ref(struct ref_store *ref_store,
	 		      const char *refname, struct object_id *oid,
	 		      struct strbuf *referent, unsigned int *type)
	 {
	-	int result, failure;
	+	int result, failure_errno = 0;
	+
	 	if (!strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD")) {
	 		return refs_read_special_head(ref_store, refname, oid, referent,
	 					      type);
	 	}

	-	failure = 0;
	+	errno = 0;
	 	result = ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
	-					     type, &failure);
	-	errno = failure;
	+					     type, &failure_errno);
	+	if (errno)
	+		BUG("Got another errno from read_raw_ref()?: %s, failure errno: %s",
	+		    strerror(errno),
	+		    strerror(failure_errno));
	+	else
	+		errno = failure_errno;
	 	return result;
	 }

	diff --git a/refs/files-backend.c b/refs/files-backend.c
	index 8f969c8f711..57cfdf738da 100644
	--- a/refs/files-backend.c
	+++ b/refs/files-backend.c
	@@ -458,8 +458,10 @@ static int files_read_raw_ref(struct ref_store *ref_store, const char *refname,
	 	ret = parse_loose_ref_contents(buf, oid, referent, type);

	 out:
	-	if (failure_errno)
	+	if (failure_errno) {
	 		*failure_errno = errno;
	+		errno = 0;
	+	}
	 	strbuf_release(&sb_path);
	 	strbuf_release(&sb_contents);
	 	return ret;
	diff --git a/refs/packed-backend.c b/refs/packed-backend.c
	index a457f18e93c..4bcb4777f0f 100644
	--- a/refs/packed-backend.c
	+++ b/refs/packed-backend.c
	@@ -740,6 +740,7 @@ static int packed_read_raw_ref(struct ref_store *ref_store, const char *refname,
	 	if (!rec) {
	 		/* refname is not a packed reference. */
	 		*failure_errno = ENOENT;
	+		errno = 0;
	 		return -1;
	 	}

Which also passes all our tests, and the packed_read_raw_ref() suggests
how you may have come up with the current pattern.
> [...]
> - * Return 0 on success. If the ref doesn't exist, set errno to ENOENT
> - * and return -1. If the ref exists but is neither a symbolic ref nor
> - * an object ID, it is broken; set REF_ISBROKEN in type, and return -1
> - * (errno should not be ENOENT) If there is another error reading the
> - * ref, set errno appropriately and return -1.
> + * Return 0 on success. If the ref doesn't exist, set failure_errno to ENOENT
> + * and return -1. If the ref exists but is neither a symbolic ref nor an object
> + * ID, it is broken; set REF_ISBROKEN in type, and return -1 (failure_errno
> + * should not be ENOENT). The files backend may return EISDIR (if the ref name
> + * is a directory) and ENOTDIR (if a ref prefix is not a directory). If there is
> + * another error reading the ref, set failure_errno appropriately and return -1.
>   *

This documentation is a bit confusing in light of the above comments. I
assume that "set failure_errno appropriately" here means that it will do
it, but it's really worded like it's suggesting that you should do it,
and does the "another error" suggest that a caller may need to deal with
an arbitrary errno, not just the ENOENT|EISDIR|ENOTDIR?