Re: [PATCH v2 6/7] refs: implement partial reference transaction support

Karthik Nayak <karthik.188@xxxxxxxxx> · Wed, 5 Mar 2025 15:20:21 +0100

Phillip Wood <phillip.wood123@xxxxxxxxx> writes:

> Hi Karthik
>
> On 03/03/2025 20:21, Karthik Nayak wrote:
>> Phillip Wood <phillip.wood123@xxxxxxxxx> writes:
>>> On 25/02/2025 09:29, Karthik Nayak wrote:
>>>> Git's reference transactions are all-or-nothing: either all updates
>>>> succeed, or none do. While this atomic behavior is generally desirable,
>>>> it can be suboptimal especially when using the reftable backend, where
>>>> batching multiple reference updates into a single transaction is more
>>>> efficient than performing them sequentially.
>>>>
>>>> Introduce partial transaction support with a new flag,
>>>> 'REF_TRANSACTION_ALLOW_PARTIAL'. When enabled, this flag allows
>>>> individual reference updates that would typically cause the entire
>>>> transaction to fail due to non-system-related errors to be marked as
>>>> rejected while permitting other updates to proceed. Non-system-related
>>>> errors include issues caused by user-provided input values, whereas
>>>> system-related errors, such as I/O failures or memory issues, continue
>>>> to result in a full transaction failure. This approach enhances
>>>> flexibility while preserving transactional integrity where necessary.
>>>>
>>>> The implementation introduces several key components:
>>>>
>>>>     - Add 'rejection_err' field to struct `ref_update` to track failed
>>>>       updates with failure reason.
>>>>
>>>>     - Modify reference backends (files, packed, reftable) to handle
>>>>       partial transactions by using `ref_transaction_set_rejected()`
>>>>       instead of failing the entire transaction when
>>>>       `REF_TRANSACTION_ALLOW_PARTIAL` is set.
>>>>
>>>>     - Add `ref_transaction_for_each_rejected_update()` to let callers
>>>>       examine which updates were rejected and why.
>>>
>>> I think this is a much better design. I wonder if we want to signal to
>>> the caller of ref_transaction_commit() that there were ignored errors
>>> rather than forcing them to call ref_transaction_for_each_rejected() to
>>> find that out. Another possibility would be to call the callback from
>>> ref_transaction_commit() but that would mean changing the signature of
>>> ref_transaction_begin() to take the callback and user data when
>>> REF_TRANSACTION_ALLOW_PARTIAL is passed.
>>>
>>
>> Yes, I did toy around modifying `ref_transaction_*` at first, but I
>> think the current implementation is slightly better. Users of the ref
>> API do not have to worry about complexity of partial transactions unless
>> they really need to. So in that case, for most users, the API remains
>> simple and clean, and for specific users who do want partial transaction
>> support, they can activate it via the flag and use the iterator to
>> collect the rejections at the end.
>
> That makes sense. I have a slight concern that iterating through the
> errors is O(number of ref updates) rather than O(number of errors). If
> we expect most updates to succeed that it is a shame to have to check
> them all just to see there were no errors. Maybe we could be store the
> errors in a separate list of (update-index, error) pairs to avoid that.
>

That is a good point, let me add in a array inside 'ref_transaction'
which will track rejected updates.

> Best Wishes
>
> Phillip

Thanks!
Attachment:
signature.asc

Description: PGP signature