Re: [PATCH v10 25/44] receive-pack.c: use a reference transaction for updating the refs

Ronnie Sahlberg <sahlberg@xxxxxxxxxx> · Mon, 19 May 2014 12:02:56 -0700

On Sat, May 17, 2014 at 8:35 AM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
> On 05/16/2014 07:37 PM, Ronnie Sahlberg wrote:
>> Wrap all the ref updates inside a transaction to make the update atomic.
>>
>> Signed-off-by: Ronnie Sahlberg <sahlberg@xxxxxxxxxx>
>> ---
>>  builtin/receive-pack.c | 20 ++++++++++----------
>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
>> index c323081..5534138 100644
>> --- a/builtin/receive-pack.c
>> +++ b/builtin/receive-pack.c
>> @@ -46,6 +46,8 @@ static void *head_name_to_free;
>>  static int sent_capabilities;
>>  static int shallow_update;
>>  static const char *alt_shallow_file;
>> +static struct strbuf err = STRBUF_INIT;
>> +static struct ref_transaction *transaction;
>>
>>  static enum deny_action parse_deny_action(const char *var, const char *value)
>>  {
>> @@ -475,7 +477,6 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>>       const char *namespaced_name;
>>       unsigned char *old_sha1 = cmd->old_sha1;
>>       unsigned char *new_sha1 = cmd->new_sha1;
>> -     struct ref_lock *lock;
>>
>>       /* only refs/... are allowed */
>>       if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) {
>> @@ -580,15 +581,9 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>>                   update_shallow_ref(cmd, si))
>>                       return "shallow error";
>>
>> -             lock = lock_any_ref_for_update(namespaced_name, old_sha1,
>> -                                            0, NULL);
>> -             if (!lock) {
>> -                     rp_error("failed to lock %s", name);
>> -                     return "failed to lock";
>> -             }
>> -             if (write_ref_sha1(lock, new_sha1, "push")) {
>> -                     return "failed to write"; /* error() already called */
>> -             }
>> +             if (ref_transaction_update(transaction, namespaced_name,
>> +                                        new_sha1, old_sha1, 0, 1, &err))
>> +                     return "failed to update";
>>               return NULL; /* good */
>>       }
>>  }
>> @@ -812,6 +807,7 @@ static void execute_commands(struct command *commands,
>>       head_name = head_name_to_free = resolve_refdup("HEAD", sha1, 0, NULL);
>>
>>       checked_connectivity = 1;
>> +     transaction = ref_transaction_begin();
>>       for (cmd = commands; cmd; cmd = cmd->next) {
>>               if (cmd->error_string)
>>                       continue;
>> @@ -827,6 +823,10 @@ static void execute_commands(struct command *commands,
>>                       checked_connectivity = 0;
>>               }
>>       }
>> +     if (ref_transaction_commit(transaction, "push", &err))
>> +             error("%s", err.buf);
>> +     ref_transaction_free(transaction);
>> +     strbuf_release(&err);
>>
>>       if (shallow_update && !checked_connectivity)
>>               error("BUG: run 'git fsck' for safety.\n"
>>
>
> This patch is strange, because even if one ref_transaction_update() call
> fails, subsequent updates are nevertheless also attempted, and the
> ref_transaction_commit() is also attempted.  Is this an officially
> sanctioned use of the ref_transactions API?  Should it be?

I think it should be supported. Otherwise, unless the entire
transaction is localized in a single block, you end up having to
check and recheck the return value everywhere.

It makes the API much easier to use if you can continue calling
transaction functions even after the transaction has failed. If the
transaction has already failed then _update/_create/_delete will do
nothing except return an error.

If _commit is called on a failed transaction then the commit will fail
with an error and do nothing.
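
To make those semantics concrete, here is a minimal standalone sketch.
It is not the code from refs.c: the toy_* names, the struct layout and
the "ok" parameter that simulates a failure are all made up for
illustration. It only shows the behaviour described above, assuming a
status field on the transaction.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real refs.c types. */
enum toy_status { TOY_OPEN, TOY_CLOSED, TOY_ERROR };

struct toy_transaction {
	enum toy_status status;
	int nr_queued;	/* updates accepted so far */
	char err[128];	/* message from the first failure, if any */
};

static void toy_begin(struct toy_transaction *t)
{
	t->status = TOY_OPEN;
	t->nr_queued = 0;
	t->err[0] = '\0';
}

/* Queue an update; "ok" simulates whether this update would succeed. */
static int toy_update(struct toy_transaction *t, const char *refname, int ok)
{
	if (t->status != TOY_OPEN)
		return -1;	/* already failed or closed: do nothing */
	if (!ok) {
		t->status = TOY_ERROR;
		snprintf(t->err, sizeof(t->err), "cannot update %s", refname);
		return -1;
	}
	t->nr_queued++;
	return 0;
}

/* Commit fails, and does nothing, unless the transaction is still open. */
static int toy_commit(struct toy_transaction *t)
{
	if (t->status != TOY_OPEN)
		return -1;
	t->status = TOY_CLOSED;
	return 0;
}
```

With this, an update that follows a failed one returns -1 without
touching the error message, and toy_commit() on the failed transaction
returns -1 as well.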

I think it is convenient, and it allows things like :

struct ref_transaction *transaction;
void foo()
{
   ...
   ref_transaction_update(transaction, ... , &err);
   ...
}

transaction = ref_transaction_begin(&err);
... do stuff and call things that eventually end up calling foo(),
possibly multiple times ...
ret = ref_transaction_commit(transaction, &err);

In foo() we do not check the return value, so we will not see or care
whether it failed. If it does fail, however, it will mark the
transaction as failed and update &err. (Note that this cannot happen
yet, since _update cannot really fail, ever, but the next series will
introduce _update failures when we move locking there.)

Instead we can rely on the fact that if _update failed, the call to
_commit will fail too and &err is already updated, so we can defer any
checking for errors until _commit time.

This will make the API much more convenient for use cases where you
begin/commit the transaction in one function but the calls to
_update/_delete/_create are somewhere else, possibly many function
calls away.
It does not mean that a caller must ignore the return value from
ref_transaction_update, just that the caller can do so and defer
checking for errors until later when it would be more convenient.
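
As a rough illustration of that deferred checking, the sketch below
has a helper that ignores the return value and a caller that looks at
errors only once, at commit time. Everything here (mini_txn,
mini_update, mini_commit, push_all, and the empty-name failure
condition) is hypothetical and stands in for the real API.

```c
#include <stdio.h>

/* Hypothetical miniature transaction; not git's ref_transaction. */
struct mini_txn {
	int failed;	/* set by the first failing update */
	int nr_updates;	/* updates queued before any failure */
	char err[64];
};

static struct mini_txn txn;	/* shared, like a file-scope transaction */

static int mini_update(struct mini_txn *t, const char *name)
{
	if (t->failed)
		return -1;	/* earlier failure: do nothing */
	if (!name || !*name) {	/* stand-in for a real failure condition */
		t->failed = 1;
		snprintf(t->err, sizeof(t->err), "invalid ref name");
		return -1;
	}
	t->nr_updates++;
	return 0;
}

/* Far away from begin/commit; deliberately ignores the return value. */
static void foo(const char *name)
{
	mini_update(&txn, name);
}

static int mini_commit(struct mini_txn *t)
{
	return t->failed ? -1 : 0;
}

/* Begin, call foo() several times, check for errors once at commit. */
static int push_all(const char **names, int nr)
{
	int i;

	txn.failed = 0;
	txn.nr_updates = 0;
	txn.err[0] = '\0';
	for (i = 0; i < nr; i++)
		foo(names[i]);
	if (mini_commit(&txn)) {
		fprintf(stderr, "error: %s\n", txn.err);
		return -1;
	}
	return 0;
}
```

A caller that does want to react right away can still check the return
value of mini_update() at the call site; the deferred check is only an
option.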

Please see current:
https://github.com/rsahlberg/git/tree/ref-transactions
and patch:
refs.c: add transaction.status and track OPEN/CLOSED/ERROR

> It might be
> a way to give feedback to the user on multiple attempted reference
> updates at once (i.e., address my comment about the last patch).
>
> If this is sanctioned, then it might be appropriate for the transaction
> to keep track of the fact that one or more reference updates failed, and
> when *_commit() is called to fail the whole transaction.

Yes. I updated refs.h to indicate that you can continue using
_update/_create/_delete even if a previous call has failed but that
these calls will now just return an error.

This does mean that on the first update that fails for a ref we fail
the transaction, and any further _update calls fail immediately. So if
there are additional refs that would also have failed, we will not log
them. I think this is what we want, since once one ref update has
failed it would be really hard to determine whether the next failure
was just a side effect of the first one.

>
> In any case, I think it is important to document, as part of the API
> docs, whether this is sanctioned or not, and if so, what exactly are its
> semantics.
>
> I've run out of time for today so I'm going to have to stop here.  FWIW
> patches 01-23 looked OK aside from the comments that I have made.
>
> Michael
>
> --
> Michael Haggerty
> mhagger@xxxxxxxxxxxx
> http://softwareswirl.blogspot.com/