Re: [PATCH 12/16] refs: always handle non-normal refs in files backend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/03/2015 01:35 AM, David Turner wrote:
> Always handle non-normal (per-worktree or pseudo) refs in the files
> backend instead of alternate backends.
> 
> Sometimes a ref transaction will update both a per-worktree ref and a
> normal ref.  For instance, an ordinary commit might update
> refs/heads/master and HEAD (or at least HEAD's reflog).
> 
> We handle three cases here:
> 
> 1. updates to normal refs continue to go through the chosen backend
> 
> 2. updates to non-normal refs with REF_NODEREF or to non-symbolic refs
> are moved to a separate files backend transaction.
> 
> 3. updates to symbolic refs are dereferenced to their base ref.  The
> update to the base ref then goes through the ordinary backend, while
> the files backend is directly called to update the symref's reflog.
> 
> Signed-off-by: David Turner <dturner@xxxxxxxxxxxxxxxx>
> ---
>  refs.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 139 insertions(+), 2 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 808053f..e48e43a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -9,6 +9,11 @@
>  #include "object.h"
>  #include "tag.h"
>  
> +const char split_transaction_fail_warning[] =
> +	"A ref transaction was split across two refs backends.  Part of the "
> +	"transaction succeeded, but then the update to the per-worktree refs "
> +	"failed.  Your repository may be in an inconsistent state.";
> +
>  /*
>   * We always have a files backend and it is the default.
>   */
> @@ -784,6 +789,13 @@ void ref_transaction_free(struct ref_transaction *transaction)
>  	free(transaction);
>  }
>  
> +static void add_update_obj(struct ref_transaction *transaction,
> +			   struct ref_update *update)
> +{
> +	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
> +	transaction->updates[transaction->nr++] = update;
> +}
> +
>  static struct ref_update *add_update(struct ref_transaction *transaction,
>  				     const char *refname)
>  {
> @@ -791,8 +803,7 @@ static struct ref_update *add_update(struct ref_transaction *transaction,
>  	struct ref_update *update = xcalloc(1, sizeof(*update) + len);
>  
>  	memcpy((char *)update->refname, refname, len); /* includes NUL */
> -	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
> -	transaction->updates[transaction->nr++] = update;
> +	add_update_obj(transaction, update);
>  	return update;
>  }
>  
> @@ -1130,11 +1141,87 @@ int refs_init_db(struct strbuf *err, int shared)
>  	return the_refs_backend->init_db(err, shared);
>  }
>  
> +/*
> + * Special case for non-normal refs.  For symbolic-refs when
> + * REF_NODEREF is not turned on, we dereference them here and replace
> + * updates to the symbolic refs with updates to the underlying ref.
> + * Then we do our own reflogging for the symbolic ref.
> + *
> + * We move other non-normal ref updates with into a specially-created
> + * files-backend transaction
> + */

Extra word? s/with//?

> +static int move_abnormal_ref_updates(struct ref_transaction *transaction,
> +				     struct ref_transaction *files_transaction,
> +				     struct string_list *symrefs)
> +{
> +	int i;
> +
> +	for (i = 0; i < transaction->nr; i++) {
> +		struct ref_update *update = transaction->updates[i];
> +		const char *resolved;
> +		int flags = 0;
> +		unsigned char sha1[20];
> +
> +		if (ref_type(update->refname) == REF_TYPE_NORMAL)
> +			continue;
> +
> +		resolved = resolve_ref_unsafe(update->refname, 0, sha1, &flags);
> +
> +		if (update->flags & REF_NODEREF || !(flags & REF_ISSYMREF)) {
> +			int last;
> +
> +			add_update_obj(files_transaction, update);
> +			/*
> +			 * Replace this transaction with the
> +			 * last transaction, removing it from
> +			 * the list of backend transactions
> +			 */
> +			last = --transaction->nr;
> +			transaction->updates[i] = transaction->updates[last];

The "last" temporary variable could be trivially inlined.

> +			continue;
> +		}
> +
> +		if (resolved) {
> +			struct ref_update *new_update;
> +			struct string_list_item *item;
> +
> +			if (ref_type(resolved) != REF_TYPE_NORMAL)
> +				die("Non-normal symbolic ref `%s` points to non-normal ref `%s`", update->refname, resolved);

We don't usually use backticks in error messages. Please use "'" instead.

Also, please store this error message into a "strbuf *err" and report it
via the usual mechanism.

> +			new_update = xmalloc(sizeof(*new_update) +
> +					     strlen(resolved) + 1);
> +			memcpy(new_update, update, sizeof(*update));

Wouldn't it be preferable to replace this messy replacement code
(including the memcpy(), which can't be checked by the type system) with
a call to ref_transaction_update() followed by moving the new update to
this position in the list and possibly tweaking some of its fields?

> +			if (update->flags & REF_HAVE_OLD &&
> +			    hashcmp(sha1, update->old_sha1)) {
> +				/* consistency check failed */
> +				free(new_update);
> +				return -1;

We need an error message to be reported in this case; i.e., via a
"struct strbuf *err" argument.

But actually, I don't understand why this check is needed here at all.
Isn't it redundant with a similar check that will be done later (and
properly, under lock) as part of the main ref_transaction_commit()?

> +			} else {
> +				hashcpy(update->old_sha1, sha1);
> +			}
> +
> +			strcpy((char *)new_update->refname, resolved);
> +			transaction->updates[i] = new_update;
> +
> +			item = string_list_append(symrefs, update->refname);
> +			item->util = new_update;
> +			free(update);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  int ref_transaction_commit(struct ref_transaction *transaction,
>  			   struct strbuf *err)
>  {
>  	int ret = -1;
>  	struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
> +	struct string_list files_affected_refnames = STRING_LIST_INIT_NODUP;
> +	struct string_list symrefs = STRING_LIST_INIT_DUP;
> +	struct string_list_item *item;
> +	struct ref_transaction *files_transaction = NULL;
>  
>  	assert(err);
>  
> @@ -1146,6 +1233,26 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  		return 0;
>  	}
>  
> +	if (the_refs_backend != &refs_be_files) {
> +		files_transaction = ref_transaction_begin(err);
> +		if (!files_transaction)
> +			die("%s", err->buf);

I think dying here is too abrupt. Some callers try to recover from a
failed ref_transaction_commit(). Couldn't you "goto done" and let the
caller deal with err?

> +		ret = move_abnormal_ref_updates(transaction, files_transaction,
> +						&symrefs);
> +		if (ret)
> +			goto done;
> +
> +		/* files backend commit */
> +		if (ref_update_reject_duplicates(files_transaction,
> +						 &files_affected_refnames,
> +						 err)) {
> +			ret = TRANSACTION_GENERIC_ERROR;
> +			goto done;
> +		}

Is it correct to reject_duplicates among "abnormal" references and
"normal" references separately? ***

> +	}
> +
> +	/* main backend commit */
>  	if (ref_update_reject_duplicates(transaction, &affected_refnames, err)) {
>  		ret = TRANSACTION_GENERIC_ERROR;
>  		goto done;
> @@ -1153,8 +1260,35 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  
>  	ret = the_refs_backend->transaction_commit(transaction,
>  						   &affected_refnames, err);
> +	if (ret)
> +		goto done;
> +
> +	if (the_refs_backend != &refs_be_files) {

This conditional would perhaps be more to the point if expressed as "if
(files_transaction)".

> +		ret = refs_be_files.transaction_commit(files_transaction,
> +						       &files_affected_refnames,
> +						       err);
> +		if (ret) {
> +			warning(split_transaction_fail_warning);
> +			goto done;
> +		}
> +
> +		/* reflogging for dereferenced symbolic refs */
> +		for_each_string_list_item(item, &symrefs) {
> +			struct ref_update *update = item->util;
> +			if (files_log_ref_write(item->string, update->old_sha1,
> +						update->new_sha1,
> +						update->msg, update->flags, err))
> +				warning("failed to log ref update for symref %s",
> +					item->string);
> +		}

I think this code is incorrect because it doesn't lock the symbolic
reference before modifying its reflog (though I seem to recall that the
old code was buggy in this respect, too).

I wonder whether it would be simpler overall to leave the ref_update for
the symbolic ref in the files_transaction, but set a special internal
internal flag like REF_LOG_ONLY which tells the usual transaction_commit
code to add a reflog entry for update->old_sha1 to update->new_sha1,
without actually changing the reference.

> +	}
> +
>  done:
>  	string_list_clear(&affected_refnames, 0);
> +	string_list_clear(&files_affected_refnames, 0);
> +	if (files_transaction)
> +		ref_transaction_free(files_transaction);
> +	string_list_clear(&symrefs, 0);
>  	return ret;
>  }
>  
> @@ -1210,6 +1344,9 @@ int peel_ref(const char *refname, unsigned char *sha1)
>  int create_symref(const char *ref_target, const char *refs_heads_master,
>  		  const char *logmsg)
>  {
> +	if (ref_type(ref_target) != REF_TYPE_NORMAL)
> +		return refs_be_files.create_symref(ref_target, refs_heads_master,
> +						   logmsg);
>  	return the_refs_backend->create_symref(ref_target, refs_heads_master,
>  					       logmsg);
>  }
> 

I very much like the idea of introducing special handling for symbolic
reference updates within a transaction. In fact, I think I would go even
farther:

Let's take the example of an update to HEAD, which currently points at
refs/heads/master. I think it would *always* be a good idea (i.e., even
when only the files backend is in use) to split that ref_update into two
ref_updates:

1. One to update refs/heads/master and add a reflog entry for that
   reference.

2. One to add a reflog entry for HEAD (i.e. using the new REF_LOG_ONLY
   flag suggested above).

Why?

* It ensures that both references are locked correctly while their
  reflogs are updated. (I believe the current code gets this wrong.)

* It improves the reject_duplicates coverage, which (I think) currently
  wouldn't detect the conflict between a direct update of
  refs/heads/master with a simultaneous update of the same reference
  via HEAD.

* It could later be generalized to an update that goes through multiple
  layers of symref indirection (though this would be a very low
  priority).

This might benefit the split-backend situation that you are implementing
here. You could first do the symref-splitting step I just described, and
*then* separate the non-normal from the normal refs. I think the net
result would be simpler.

This patch is a lot to digest. I'm not yet confident that I have thought
through all of the ramifications of this patch. I guess a few iterations
will be needed in any case.

By the way, all of the patches preceding this one that I haven't
commented on look OK to me.

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]