Re: [PATCH 5/6] refs: implement partial reference transaction support

Karthik Nayak <karthik.188@xxxxxxxxx> · Fri, 21 Feb 2025 05:33:38 -0500

Patrick Steinhardt <ps@xxxxxx> writes:

> On Fri, Feb 07, 2025 at 08:34:40AM +0100, Karthik Nayak wrote:
>> Git's reference transactions are all-or-nothing: either all updates
>> succeed, or none do. While this atomic behavior is generally desirable,
>> it can be suboptimal when using the reftable backend, where batching
>> multiple reference updates into a single transaction is more efficient
>> than performing them sequentially.
>
> In fact it's even inefficient for the "files" backend. The whole
> machinery around creating a new transaction, preparing it, committing it
> and then cleaning up its state does bring a bunch of overhead with it.
> But true, for the "reftable" backend it's way more impactful.
>
>> diff --git a/refs.c b/refs.c
>> index b420a120102b3793168598b885bba68e4f5f5f03..75dbd84acbc41658d4b8b6b5e7763c04e78d0061 100644
>> --- a/refs.c
>> +++ b/refs.c
>> @@ -1211,6 +1212,14 @@ void ref_transaction_free(struct ref_transaction *transaction)
>>  	free(transaction);
>>  }
>>
>> +void ref_transaction_add_rejection(struct ref_transaction *transaction,
>> +				   size_t update_idx, struct strbuf *err)
>
> "add" to me sounds like you're adding a new thingy to the transaction,
> but you rather update something. How about `ref_update_set_rejected()`
> or `ref_transacton_set_rejected()`?
>

Fair enough, I've changed it to `ref_transacton_set_rejected()`.

>> +{
>> +	struct ref_update *update = transaction->updates[update_idx];
>
> Do we want to `BUG()` in case `update_idx >= transaction->nr`?
>

Good point, let me add that in.

>> +	update->rejected = 1;
>> +	strbuf_addbuf(&update->rejection_err, err);
>> +}
>
> Do we really need a string as rejection error? I'd expect that the set
> of failures that lead to rejection should be rather limited, which means
> that we could use an enum instead. This would unify the errors across
> backends and also allows us to figure out the root cause of rejection in
> other subsystems.
>
> If we introduced an enum, we could eventually even iterate a bit on the
> mechanism and rather trivially tell the backends which kind of failures
> are acceptable. As an example, a conflicting ref update may for example
> be ignored and not cause failure, a conflicting path name might cause
> failure.
>

That's a good point, This also allows us to eventually extend the flag
to do something like you mentioned where `--allow-partial=all` would
skip all errors. But one could optimize to also say
`--allow-partial=name_conflict,old_value` to only skip errors due to
refname conflicts and invalid/incorrect old_value.

I'll add another commit to introduce and add 'enum transaction_error'
and build around it.

>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index 9fc5454678340dd7c72539bfa0f15ee7eb24b1ff..99ec29164fbd30635125cc2325aab3d300cf906c 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -2852,8 +2852,18 @@ static int files_transaction_prepare(struct ref_store *ref_store,
>>
>>  		ret = lock_ref_for_update(refs, update, transaction,
>>  					  head_ref, err);
>> -		if (ret)
>> +		if (ret) {
>
> I wonder whether we want to accept all failures. Some failures are
> certainly benign, like for example mismatching expected OIDs or a
> conflict due to a preexisting ref that blocks the path. But other kinds
> of failures which are unexpected might be a bit more on the dangerous
> side to accept, so I think we should be careful here.
>

Fair point. For the current implementation with the enum design
discussed above. I think it would be best to skip over all user
oriented errors. But any system errors would actually cease the
transaction. We can further iterate on this later.

> The same comment also applies to the other backends.
>
> Patrick
Attachment:
signature.asc

Description: PGP signature