Re: [PATCH v2 2/7] update-ref: add support for symref-verify

Patrick Steinhardt <ps@xxxxxx> · Fri, 19 Apr 2024 11:40:30 +0200

On Fri, Apr 12, 2024 at 11:59:03AM +0200, Karthik Nayak wrote:
> From: Karthik Nayak <karthik.188@xxxxxxxxx>
> 
> In the previous commit, we added the required base for adding symref
> support in transactions provided by the 'git-update-ref(1)'. This commit
> introduces the 'symref-verify' command which is similar to the existing
> 'verify' command for regular refs.
> 
> The 'symref-verify' command allows users to verify if a provided <ref>
> contains the provided <old-ref> without changing the <ref>. If <old-ref>
> is not provided, the command will verify that the <ref> doesn't exist.
> Since we're checking for symbolic refs, this command will only work with
> the 'no-deref' mode. This is because any dereferenced symbolic ref will
> point to an object and not a ref and the regular 'verify' command can be
> used in such situations.
> 
> This commit adds all required helper functions required to also
> introduce the other symref commands, namely create, delete, and update.
> We also add tests to test the command in both the regular stdin mode and
> also with the '-z' flag.
> 
> When the user doesn't provide a <old-ref> we need to check that the
> provided <ref> doesn't exist. And to do this, we take over the existing
> understanding that <old-oid> when set to its zero value, it refers to
> the ref not existing. While this seems like a mix of contexts between
> using <*-ref> and <*-oid>, this actually works really well, especially
> considering the fact that we want to eventually also introduce
> 
>     symref-update SP <ref> SP <new-ref> [SP (<old-oid> | <old-rev>)] LF
> 
> and here, we'd allow the user to update a regular <ref> to a symref and
> use <old-oid> to check the <ref>'s oid. This can be extrapolated to the
> user using this to create a symref when provided a zero <old-oid>. Which
> will work given how we're setting it up.
> 
> We also disable the reference-transaction hook for symref-updates which
> will be tackled in its own commit.
> 
> Add required tests for 'symref-verify' while also adding reflog checks for
> the pre-existing 'verify' tests.
> 
> Signed-off-by: Karthik Nayak <karthik.188@xxxxxxxxx>
> ---
>  Documentation/git-update-ref.txt |  7 +++
>  builtin/update-ref.c             | 82 ++++++++++++++++++++++++++++----
>  refs.c                           | 37 +++++++++++---
>  refs.h                           |  1 +
>  refs/files-backend.c             | 46 +++++++++++++++++-
>  refs/refs-internal.h             |  7 +++
>  refs/reftable-backend.c          | 23 ++++++++-
>  t/t1400-update-ref.sh            | 80 ++++++++++++++++++++++++++++++-
>  8 files changed, 262 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/git-update-ref.txt b/Documentation/git-update-ref.txt
> index 374a2ebd2b..749aaa7892 100644
> --- a/Documentation/git-update-ref.txt
> +++ b/Documentation/git-update-ref.txt
> @@ -65,6 +65,7 @@ performs all modifications together.  Specify commands of the form:
>  	create SP <ref> SP <new-oid> LF
>  	delete SP <ref> [SP <old-oid>] LF
>  	verify SP <ref> [SP <old-oid>] LF
> +	symref-verify SP <ref> [SP <old-ref>] LF
>  	option SP <opt> LF
>  	start LF
>  	prepare LF
> @@ -86,6 +87,7 @@ quoting:
>  	create SP <ref> NUL <new-oid> NUL
>  	delete SP <ref> NUL [<old-oid>] NUL
>  	verify SP <ref> NUL [<old-oid>] NUL
> +	symref-verify SP <ref> [NUL <old-ref>] NUL
>  	option SP <opt> NUL
>  	start NUL
>  	prepare NUL
> @@ -117,6 +119,11 @@ verify::
>  	Verify <ref> against <old-oid> but do not change it.  If
>  	<old-oid> is zero or missing, the ref must not exist.
>  
> +symref-verify::
> +	Verify symbolic <ref> against <old-ref> but do not change it.
> +	If <old-ref> is missing, the ref must not exist.  Can only be
> +	used in `no-deref` mode.

Should this say "is zero or missing", like the comment for "verify"
does?

[snip]
> @@ -297,11 +321,48 @@ static void parse_cmd_verify(struct ref_transaction *transaction,
>  		die("verify %s: extra input: %s", refname, next);
>  
>  	if (ref_transaction_verify(transaction, refname, &old_oid,
> -				   update_flags, &err))
> +				   NULL, update_flags, &err))
> +		die("%s", err.buf);
> +
> +	update_flags = default_flags;
> +	free(refname);
> +	strbuf_release(&err);
> +}
> +
> +static void parse_cmd_symref_verify(struct ref_transaction *transaction,
> +                                    const char *next, const char *end)
> +{
> +	struct strbuf err = STRBUF_INIT;
> +	struct object_id old_oid;
> +	char *refname, *old_ref;
> +
> +	if (!(update_flags & REF_NO_DEREF))
> +		die("symref-verify: cannot operate with deref mode");

This feels quite restrictive to me. Wouldn't it be preferable to simply
ignore `REF_NO_DEREF` here? It basically means that this command can't
ever be used in a normal `git update-ref --stdin` session.

> +	refname = parse_refname(&next);
> +	if (!refname)
> +		die("symref-verify: missing <ref>");
> +
> +	/*
> +	 * old_ref is optional, but we want to differentiate between
> +	 * a NULL and zero value.
> +	 */
> +	old_ref = parse_next_refname(&next);
> +	if (!old_ref)
> +		old_oid = *null_oid();
> +	else if (read_ref(old_ref, NULL))
> +		die("symref-verify %s: invalid <old-ref>", refname);
> +
> +	if (*next != line_termination)
> +		die("symref-verify %s: extra input: %s", refname, next);
> +
> +	if (ref_transaction_verify(transaction, refname, old_ref ? NULL : &old_oid,
> +				   old_ref, update_flags | REF_SYMREF_UPDATE, &err))
>  		die("%s", err.buf);
>  
>  	update_flags = default_flags;
>  	free(refname);
> +	free(old_ref);
>  	strbuf_release(&err);
>  }
>  
> @@ -380,15 +441,16 @@ static const struct parse_cmd {
>  	unsigned args;
>  	enum update_refs_state state;
>  } command[] = {
> -	{ "update",  parse_cmd_update,  3, UPDATE_REFS_OPEN },
> -	{ "create",  parse_cmd_create,  2, UPDATE_REFS_OPEN },
> -	{ "delete",  parse_cmd_delete,  2, UPDATE_REFS_OPEN },
> -	{ "verify",  parse_cmd_verify,  2, UPDATE_REFS_OPEN },
> -	{ "option",  parse_cmd_option,  1, UPDATE_REFS_OPEN },
> -	{ "start",   parse_cmd_start,   0, UPDATE_REFS_STARTED },
> -	{ "prepare", parse_cmd_prepare, 0, UPDATE_REFS_PREPARED },
> -	{ "abort",   parse_cmd_abort,   0, UPDATE_REFS_CLOSED },
> -	{ "commit",  parse_cmd_commit,  0, UPDATE_REFS_CLOSED },
> +	{ "update",        parse_cmd_update,        3, UPDATE_REFS_OPEN },
> +	{ "create",        parse_cmd_create,        2, UPDATE_REFS_OPEN },
> +	{ "delete",        parse_cmd_delete,        2, UPDATE_REFS_OPEN },
> +	{ "verify",        parse_cmd_verify,        2, UPDATE_REFS_OPEN },
> +	{ "symref-verify", parse_cmd_symref_verify, 2, UPDATE_REFS_OPEN },
> +	{ "option",        parse_cmd_option,        1, UPDATE_REFS_OPEN },
> +	{ "start",         parse_cmd_start,         0, UPDATE_REFS_STARTED },
> +	{ "prepare",       parse_cmd_prepare,       0, UPDATE_REFS_PREPARED },
> +	{ "abort",         parse_cmd_abort,         0, UPDATE_REFS_CLOSED },
> +	{ "commit",        parse_cmd_commit,        0, UPDATE_REFS_CLOSED },
>  };
>  
>  static void update_refs_stdin(void)
> diff --git a/refs.c b/refs.c
> index 967c81167e..124b294c9f 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -19,6 +19,7 @@
>  #include "object-store-ll.h"
>  #include "object.h"
>  #include "path.h"
> +#include "string.h"
>  #include "tag.h"
>  #include "submodule.h"
>  #include "worktree.h"
> @@ -29,6 +30,7 @@
>  #include "date.h"
>  #include "commit.h"
>  #include "wildmatch.h"
> +#include "wrapper.h"
>  
>  /*
>   * List of all available backends
> @@ -1217,6 +1219,7 @@ void ref_transaction_free(struct ref_transaction *transaction)
>  
>  	for (i = 0; i < transaction->nr; i++) {
>  		free(transaction->updates[i]->msg);
> +		free((void *)transaction->updates[i]->old_ref);
>  		free(transaction->updates[i]);
>  	}
>  	free(transaction->updates);
> @@ -1242,10 +1245,19 @@ struct ref_update *ref_transaction_add_update(
>  
>  	update->flags = flags;
>  
> -	if (flags & REF_HAVE_NEW)
> -		oidcpy(&update->new_oid, new_oid);
> -	if (flags & REF_HAVE_OLD)
> -		oidcpy(&update->old_oid, old_oid);
> +	/*
> +	 * The ref values are to be considered over the oid values when we're
> +	 * doing symref operations.
> +	 */

I feel like this is a statement that should be backed up by a deeper
explanation of why that is. I'm still wondering here why we cannot
assert that the old value is an object ID when I want to update it to a
symref, or alternatively why it would even be possible to have both
`REF_SYMREF_UPDATE` and a set of other, incompatible fields set. It
feels like this should be a `BUG()` instead if this is supposedly an
unsupported configuration rather than silently ignoring it.

In any case, I feel like it would be easier to reason about if this was
introduced together with the actual user. As far as I can see this code
shouldn't ever be hit for "verify-symref", right? Currently, the reader
is forced to figure out what is and isn't related to the new command.

> +	if (update->flags & REF_SYMREF_UPDATE) {
> +		if (old_ref)
> +			update->old_ref = xstrdup(old_ref);
> +	} else {
> +		if (flags & REF_HAVE_NEW)
> +			oidcpy(&update->new_oid, new_oid);
> +		if (flags & REF_HAVE_OLD)
> +			oidcpy(&update->old_oid, old_oid);
> +	}
>  	update->msg = normalize_reflog_message(msg);
>  	return update;
>  }
> @@ -1280,6 +1292,7 @@ int ref_transaction_update(struct ref_transaction *transaction,
>  	flags &= REF_TRANSACTION_UPDATE_ALLOWED_FLAGS;
>  
>  	flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
> +	flags |= (new_ref ? REF_HAVE_NEW : 0) | (old_ref ? REF_HAVE_OLD : 0);
>  
>  	ref_transaction_add_update(transaction, refname, flags,
>  				   new_oid, old_oid, new_ref, old_ref, msg);
> @@ -1318,14 +1331,17 @@ int ref_transaction_delete(struct ref_transaction *transaction,
>  int ref_transaction_verify(struct ref_transaction *transaction,
>  			   const char *refname,
>  			   const struct object_id *old_oid,
> +			   const char *old_ref,
>  			   unsigned int flags,
>  			   struct strbuf *err)
>  {
> -	if (!old_oid)
> +	if (flags & REF_SYMREF_UPDATE && !old_ref && !old_oid)
> +		BUG("verify called with old_ref set to NULL");
> +	if (!(flags & REF_SYMREF_UPDATE) && !old_oid)
>  		BUG("verify called with old_oid set to NULL");
>  	return ref_transaction_update(transaction, refname,
>  				      NULL, old_oid,
> -				      NULL, NULL,
> +				      NULL, old_ref,
>  				      flags, NULL, err);
>  }
>  
> @@ -2342,6 +2358,9 @@ static int run_transaction_hook(struct ref_transaction *transaction,
>  	for (i = 0; i < transaction->nr; i++) {
>  		struct ref_update *update = transaction->updates[i];
>  
> +		if (update->flags & REF_SYMREF_UPDATE)
> +			continue;
> +
>  		strbuf_reset(&buf);
>  		strbuf_addf(&buf, "%s %s %s\n",
>  			    oid_to_hex(&update->old_oid),
> @@ -2795,3 +2814,9 @@ int copy_existing_ref(const char *oldref, const char *newref, const char *logmsg
>  {
>  	return refs_copy_existing_ref(get_main_ref_store(the_repository), oldref, newref, logmsg);
>  }
> +
> +int null_new_value(struct ref_update *update) {
> +	if (update->flags & REF_SYMREF_UPDATE && update->new_ref)
> +		return 0;
> +	return is_null_oid(&update->new_oid);
> +}
> diff --git a/refs.h b/refs.h
> index 645fe9fdb8..a988e672ff 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -772,6 +772,7 @@ int ref_transaction_delete(struct ref_transaction *transaction,
>  int ref_transaction_verify(struct ref_transaction *transaction,
>  			   const char *refname,
>  			   const struct object_id *old_oid,
> +			   const char *old_ref,
>  			   unsigned int flags,
>  			   struct strbuf *err);
>  
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index e4d0aa3d41..8421530bde 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2411,6 +2411,37 @@ static const char *original_update_refname(struct ref_update *update)
>  	return update->refname;
>  }
>  
> +/*
> + * Check whether the REF_HAVE_OLD and old_ref values stored in update
> + * are consistent with ref, which is the symbolic reference's current
> + * value. If everything is OK, return 0; otherwise, write an error
> + * message to err and return -1.
> + */
> +static int check_old_ref(struct ref_update *update, char *ref,
> +			 struct strbuf *err)
> +{
> +	if (!(update->flags & REF_HAVE_OLD) ||
> +	    !strcmp(update->old_ref, ref))
> +		return 0;
> +
> +	if (!strcmp(update->old_ref, ""))
> +		strbuf_addf(err, "cannot lock ref '%s': "
> +			    "reference already exists",
> +			    original_update_refname(update));
> +	else if (!strcmp(ref, ""))
> +		strbuf_addf(err, "cannot lock ref '%s': "
> +			    "reference is missing but expected %s",
> +			    original_update_refname(update),
> +			    update->old_ref);
> +	else
> +		strbuf_addf(err, "cannot lock ref '%s': "
> +			    "is at %s but expected %s",
> +			    original_update_refname(update),
> +			    ref, update->old_ref);
> +
> +	return -1;
> +}
> +
>  /*
>   * Check whether the REF_HAVE_OLD and old_oid values stored in update
>   * are consistent with oid, which is the reference's current value. If
> @@ -2464,8 +2495,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  			       struct strbuf *err)
>  {
>  	struct strbuf referent = STRBUF_INIT;
> -	int mustexist = (update->flags & REF_HAVE_OLD) &&
> -		!is_null_oid(&update->old_oid);
> +	int mustexist = (update->flags & REF_HAVE_OLD) && !is_null_oid(&update->old_oid);

This change is a no-op, right? If so, let's rather drop it.

>  	int ret = 0;
>  	struct ref_lock *lock;
>  
> @@ -2514,6 +2544,18 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  					ret = TRANSACTION_GENERIC_ERROR;
>  					goto out;
>  				}
> +			}
> +
> +			/*
> +			 * For symref verification, we need to check the referent value
> +			 * rather than the oid. If we're dealing with regular refs or we're
> +			 * verifying a dereferenced symref, we then check the oid.
> +			 */
> +			if (update->flags & REF_SYMREF_UPDATE && update->old_ref) {
> +				if (check_old_ref(update, referent.buf, err)) {
> +					ret = TRANSACTION_GENERIC_ERROR;
> +					goto out;
> +				}
>  			} else if (check_old_oid(update, &lock->old_oid, err)) {
>  				ret = TRANSACTION_GENERIC_ERROR;
>  				goto out;
> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index 4c5fe02687..21c6b940d8 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -749,4 +749,11 @@ void base_ref_store_init(struct ref_store *refs, struct repository *repo,
>   */
>  struct ref_store *maybe_debug_wrap_ref_store(const char *gitdir, struct ref_store *store);
>  
> +/*
> + * Helper function to check if the new value is null, this
> + * takes into consideration that the update could be a regular
> + * ref or a symbolic ref.
> + */
> +int null_new_value(struct ref_update *update);

When adding it to the header we should probably prefix this to avoid
name collisions. `ref_update_is_null_new_value()` might be a mouth full,
but feels preferable to me.

>  #endif /* REFS_REFS_INTERNAL_H */
> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index 6104471199..7a03922c7b 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -938,7 +938,28 @@ static int reftable_be_transaction_prepare(struct ref_store *ref_store,
>  		 * individual refs. But the error messages match what the files
>  		 * backend returns, which keeps our tests happy.
>  		 */
> -		if (u->flags & REF_HAVE_OLD && !oideq(&current_oid, &u->old_oid)) {
> +		if ((u->flags & REF_HAVE_OLD) &&
> +		    (u->flags & REF_SYMREF_UPDATE) &&
> +		    u->old_ref) {
> +			if   (strcmp(referent.buf, u->old_ref)) {

s/   / /

> +				if (!strcmp(u->old_ref, ""))
> +					strbuf_addf(err, "cannot lock ref '%s': "
> +						    "reference already exists",
> +						    original_update_refname(u));
> +				else if (!strcmp(referent.buf, ""))
> +					strbuf_addf(err, "cannot lock ref '%s': "
> +						    "reference is missing but expected %s",
> +						    original_update_refname(u),
> +						    u->old_ref);
> +				else
> +					strbuf_addf(err, "cannot lock ref '%s': "
> +						    "is at %s but expected %s",
> +						    original_update_refname(u),
> +						    referent.buf, u->old_ref);

I'd use better-matching error messages here. I know that we talk about
"cannot lock ref" in the next branch, as well. But the only reason we
did this is to have the same error messages as the "files" backend.
Semantically, those errors don't make much sense as the "reftable"
backend never locks specific refs, but only the complete stack.

> +				ret = -1;
> +				goto done;
> +			}
> +		} else if (u->flags & REF_HAVE_OLD && !oideq(&current_oid, &u->old_oid)) {
>  			if (is_null_oid(&u->old_oid))
>  				strbuf_addf(err, _("cannot lock ref '%s': "
>  					    "reference already exists"),
> diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
> index ec3443cc87..d8ffda4096 100755
> --- a/t/t1400-update-ref.sh
> +++ b/t/t1400-update-ref.sh
> @@ -890,17 +890,23 @@ test_expect_success 'stdin update/create/verify combination works' '
>  '
>  
>  test_expect_success 'stdin verify succeeds for correct value' '
> +	test-tool ref-store main for-each-reflog-ent $m >before &&
>  	git rev-parse $m >expect &&
>  	echo "verify $m $m" >stdin &&
>  	git update-ref --stdin <stdin &&
>  	git rev-parse $m >actual &&
> -	test_cmp expect actual
> +	test_cmp expect actual &&
> +	test-tool ref-store main for-each-reflog-ent $m >after &&
> +	test_cmp before after
>  '
>  
>  test_expect_success 'stdin verify succeeds for missing reference' '
> +	test-tool ref-store main for-each-reflog-ent $m >before &&
>  	echo "verify refs/heads/missing $Z" >stdin &&
>  	git update-ref --stdin <stdin &&
> -	test_must_fail git rev-parse --verify -q refs/heads/missing
> +	test_must_fail git rev-parse --verify -q refs/heads/missing &&
> +	test-tool ref-store main for-each-reflog-ent $m >after &&
> +	test_cmp before after
>  '

The updated tests merely assert that the refs didn't change, right?

>  test_expect_success 'stdin verify treats no value as missing' '
> @@ -1641,4 +1647,74 @@ test_expect_success PIPE 'transaction flushes status updates' '
>  	test_cmp expected actual
>  '
>  
> +create_stdin_buf ()
> +{

The curly brace should go on the same line as the function name.

> +	if test "$1" = "-z"
> +	then
> +		shift
> +		printf "$F" "$@" >stdin
> +	else
> +		echo "$@" >stdin
> +	fi
> +}
> +
> +for type in "" "-z"
> +do

We should probably indent all of the tests to make it easier to see that
they run in a loop.

Patrick

> +test_expect_success "stdin ${type} symref-verify fails without --no-deref" '
> +	git symbolic-ref refs/heads/symref $a &&
> +	create_stdin_buf ${type} "symref-verify refs/heads/symref" "$a" &&
> +	test_must_fail git update-ref --stdin ${type} <stdin 2>err &&
> +	grep "fatal: symref-verify: cannot operate with deref mode" err
> +'
> +
> +test_expect_success "stdin ${type} symref-verify fails with too many arguments" '
> +	create_stdin_buf ${type} "symref-verify refs/heads/symref" "$a" "$a" &&
> +	test_must_fail git update-ref --stdin ${type} --no-deref <stdin 2>err  &&
> +	if test "$type" = "-z"
> +	then
> +		grep "fatal: unknown command: $a" err
> +	else
> +		grep "fatal: symref-verify refs/heads/symref: extra input:  $a" err
> +	fi
> +'
> +
> +test_expect_success "stdin ${type} symref-verify succeeds for correct value" '
> +	git symbolic-ref refs/heads/symref >expect &&
> +	test-tool ref-store main for-each-reflog-ent refs/heads/symref >before &&
> +	create_stdin_buf ${type} "symref-verify refs/heads/symref" "$a" &&
> +	git update-ref --stdin ${type} --no-deref <stdin &&
> +	git symbolic-ref refs/heads/symref >actual &&
> +	test_cmp expect actual &&
> +	test-tool ref-store main for-each-reflog-ent refs/heads/symref >after &&
> +	test_cmp before after
> +'
> +
> +test_expect_success "stdin ${type} symref-verify succeeds for missing reference" '
> +	test-tool ref-store main for-each-reflog-ent refs/heads/symref >before &&
> +	create_stdin_buf ${type} "symref-verify refs/heads/missing" &&
> +	git update-ref --stdin ${type} --no-deref <stdin &&
> +	test_must_fail git rev-parse --verify -q refs/heads/missing &&
> +	test-tool ref-store main for-each-reflog-ent refs/heads/symref >after &&
> +	test_cmp before after
> +'
> +
> +test_expect_success "stdin ${type} symref-verify fails for wrong value" '
> +	git symbolic-ref refs/heads/symref >expect &&
> +	create_stdin_buf ${type} "symref-verify refs/heads/symref" "$b" &&
> +	test_must_fail git update-ref --stdin ${type} --no-deref <stdin &&
> +	git symbolic-ref refs/heads/symref >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success "stdin ${type} symref-verify fails for mistaken null value" '
> +	git symbolic-ref refs/heads/symref >expect &&
> +	create_stdin_buf ${type} "symref-verify refs/heads/symref" &&
> +	test_must_fail git update-ref --stdin ${type} --no-deref <stdin &&
> +	git symbolic-ref refs/heads/symref >actual &&
> +	test_cmp expect actual
> +'
> +
> +done
> +
>  test_done
> -- 
> 2.43.GIT
> 
Attachment:
signature.asc

Description: PGP signature