On 08/30/2013 08:12 PM, Brad King wrote: > Add 'struct ref_update' to encode the information needed to update or > delete a ref (name, new sha1, optional old sha1, no-deref flag). Add > function 'update_refs' accepting an array of updates to perform. First > sort the input array to order locks consistently everywhere and reject > multiple updates to the same ref. Then acquire locks on all refs with > verified old values. Then update or delete all refs accordingly. Fail > if any one lock cannot be obtained or any one old value does not match. > > Though the refs themeselves cannot be modified together in a single s/themeselves/themselves/ > atomic transaction, this function does enable some useful semantics. > For example, a caller may create a new branch starting from the head of > another branch and rewind the original branch at the same time. This > transfers ownership of commits between branches without risk of losing > commits added to the original branch by a concurrent process, or risk of > a concurrent process creating the new branch first. > > Signed-off-by: Brad King <brad.king@xxxxxxxxxxx> > --- > refs.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > refs.h | 14 ++++++++ > 2 files changed, 135 insertions(+) > > diff --git a/refs.c b/refs.c > index 3bcd26e..901a38a 100644 > --- a/refs.c > +++ b/refs.c > @@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char *refname, > return update_ref_write(action, refname, sha1, lock, onerr); > } > > +static int ref_update_compare(const void *r1, const void *r2) > +{ > + struct ref_update *u1 = (struct ref_update *)(r1); > + struct ref_update *u2 = (struct ref_update *)(r2); If you declare u1 and u2 to be "const struct ref_update *" (i.e., add "const"), then you have const correctness and don't need the explicit casts. (And the parentheses around r1 and r2 are superfluous in any case.) > + int ret; Style: we usually put a blank line between variable declarations and the first line of code. > + ret = strcmp(u1->ref_name, u2->ref_name); > + if (ret) > + return ret; > + ret = hashcmp(u1->new_sha1, u2->new_sha1); > + if (ret) > + return ret; > + ret = hashcmp(u1->old_sha1, u2->old_sha1); > + if (ret) > + return ret; > + ret = u1->flags - u2->flags; > + if (ret) > + return ret; > + return u1->have_old - u2->have_old; > +} Is there a need to compare more than ref_name? If two entries are found with the same name, then ref_update_reject_duplicates() will error out anyway. So the relative order among entries with the same name is irrelevant. I think it would be OK to return 0 for any entries with the same ref_name, even if they differ in other fields. > + > +static int ref_update_reject_duplicates(struct ref_update *updates, int n, > + enum action_on_err onerr) > +{ > + int i; > + for (i = 1; i < n; ++i) > + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name)) > + break; The error handling code could be right here instead of the "break" statement, removing the need for the "if" conditional. > + if (i < n) { > + const char *str = "Multiple updates for ref '%s' not allowed."; > + switch (onerr) { > + case MSG_ON_ERR: error(str, updates[i].ref_name); break; > + case DIE_ON_ERR: die(str, updates[i].ref_name); break; > + case QUIET_ON_ERR: break; > + } > + return 1; > + } > + return 0; > +} > + > +int update_refs(const char *action, const struct ref_update *updates_orig, > + int n, enum action_on_err onerr) > +{ > + int ret = 0, delnum = 0, i; > + struct ref_update *updates; > + int *types; > + struct ref_lock **locks; > + const char **delnames; > + > + if (!updates_orig || !n) > + return 0; > + > + /* Allocate work space: */ > + updates = xmalloc(sizeof(struct ref_update) * n); It seems preferred here to write updates = xmalloc(sizeof(*updates) * n); as this will continue to work if the type of updates is ever changed. Similarly for the next lines. > + types = xmalloc(sizeof(int) * n); > + locks = xmalloc(sizeof(struct ref_lock *) * n); > + delnames = xmalloc(sizeof(const char *) * n); An alternative to managing separate arrays to hold types and locks would be to include the scratch space in struct ref_update and document it "for internal use only; need not be initialized by caller". On the one hand it's ugly to cruft up the "interface" with internal implementation details; on the other hand there is precedent for this sort of thing (e.g., ref_lock::force_write or lock_file::on_list) and it would simplify the code. > + > + /* Copy, sort, and reject duplicate refs: */ > + memcpy(updates, updates_orig, sizeof(struct ref_update) * n); > + qsort(updates, n, sizeof(struct ref_update), ref_update_compare); You could save some space and memory shuffling (during memcpy() and qsort()) if you would declare "updates" to be an array of pointers to "struct ref_update" rather than an array of structs. Sorting could then be done by moving pointers around instead of moving the structs. This would also make it easier for update_refs() to pass information about the references back to its caller, should that ever be needed. But I suppose that n will usually be small, so this suggestion can be considered optional. > + if (ref_update_reject_duplicates(updates, n, onerr)) { > + free(updates); > + free(types); > + free(locks); > + free(delnames); > + return 1; > + } > + > + /* Acquire all locks while verifying old values: */ > + for (i = 0; i < n; ++i) { > + locks[i] = update_ref_lock(updates[i].ref_name, > + (updates[i].have_old ? > + updates[i].old_sha1 : NULL), > + updates[i].flags, > + &types[i], onerr); > + if (!locks[i]) > + break; The error handling code could go here in place of the "break", especially if it's only "ret = 1; goto label;" as suggested below. > + } > + > + /* Abort if we did not get all locks: */ > + if (i < n) { > + while (--i >= 0) > + unlock_ref(locks[i]); > + free(updates); > + free(types); > + free(locks); > + free(delnames); > + return 1; > + } > + > + /* Perform updates first so live commits remain referenced: */ > + for (i = 0; i < n; ++i) > + if (!is_null_sha1(updates[i].new_sha1)) { > + ret |= update_ref_write(action, > + updates[i].ref_name, > + updates[i].new_sha1, > + locks[i], onerr); > + locks[i] = 0; /* freed by update_ref_write */ > + } > + Hmmm, if one of the calls to update_ref_write() fails, would it be safer to abort the rest of the work (especially the reference deletions)? > + /* Perform deletes now that updates are safely completed: */ > + for (i = 0; i < n; ++i) > + if (locks[i]) { > + delnames[delnum++] = locks[i]->ref_name; > + ret |= delete_ref_loose(locks[i], types[i]); > + } > + ret |= repack_without_refs(delnames, delnum); > + for (i = 0; i < delnum; ++i) > + unlink_or_warn(git_path("logs/%s", delnames[i])); > + clear_loose_ref_cache(&ref_cache); > + for (i = 0; i < n; ++i) > + if (locks[i]) > + unlock_ref(locks[i]); > + > + free(updates); > + free(types); > + free(locks); > + free(delnames); > + return ret; > +} There's a lot of duplicated cleanup code in the function. If you put a label before the final for loop, and if you initialize the locks array to zeros (e.g., by using xcalloc()), then the three exits could all share the same code "ret = 1; goto cleanup;". > + > struct ref *find_ref_by_name(const struct ref *list, const char *name) > { > for ( ; list; list = list->next) > diff --git a/refs.h b/refs.h > index 2cd307a..a8a7cc6 100644 > --- a/refs.h > +++ b/refs.h > @@ -214,6 +214,20 @@ int update_ref(const char *action, const char *refname, > const unsigned char *sha1, const unsigned char *oldval, > int flags, enum action_on_err onerr); > > +struct ref_update { > + const char *ref_name; > + unsigned char new_sha1[20]; > + unsigned char old_sha1[20]; > + int flags; > + int have_old; > +}; Please document this structure, especially the relationship between have_old and old_sha1. > + > +/** > + * Lock all refs and then perform all modifications. > + */ > +int update_refs(const char *action, const struct ref_update *updates, > + int n, enum action_on_err onerr); > + > extern int parse_hide_refs_config(const char *var, const char *value, const char *); > extern int ref_is_hidden(const char *); > > Overall, thanks; it looks good. I think this change is useful by itself and is also a good start towards implementing true reference transactions (which I think will be necessary pretty soon, at least for some applications). I will write another email discussing how your changes are related to some changes that I have been working on lately. Michael -- Michael Haggerty mhagger@xxxxxxxxxxxx http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html