Re: [PATCH v2 6/7] refs: implement partial reference transaction support

Phillip Wood <phillip.wood123@xxxxxxxxx> · Tue, 4 Mar 2025 10:31:23 +0000

Hi Karthik

On 03/03/2025 20:21, Karthik Nayak wrote:
Phillip Wood <phillip.wood123@xxxxxxxxx> writes:
On 25/02/2025 09:29, Karthik Nayak wrote:
Git's reference transactions are all-or-nothing: either all updates
succeed, or none do. While this atomic behavior is generally desirable,
it can be suboptimal especially when using the reftable backend, where
batching multiple reference updates into a single transaction is more
efficient than performing them sequentially.

Introduce partial transaction support with a new flag,
'REF_TRANSACTION_ALLOW_PARTIAL'. When enabled, this flag allows
individual reference updates that would typically cause the entire
transaction to fail due to non-system-related errors to be marked as
rejected while permitting other updates to proceed. Non-system-related
errors include issues caused by user-provided input values, whereas
system-related errors, such as I/O failures or memory issues, continue
to result in a full transaction failure. This approach enhances
flexibility while preserving transactional integrity where necessary.

The implementation introduces several key components:

    - Add 'rejection_err' field to struct `ref_update` to track failed
      updates with failure reason.

    - Modify reference backends (files, packed, reftable) to handle
      partial transactions by using `ref_transaction_set_rejected()`
      instead of failing the entire transaction when
      `REF_TRANSACTION_ALLOW_PARTIAL` is set.

    - Add `ref_transaction_for_each_rejected_update()` to let callers
      examine which updates were rejected and why.

I think this is a much better design. I wonder if we want to signal to
the caller of ref_transaction_commit() that there were ignored errors
rather than forcing them to call ref_transaction_for_each_rejected() to
find that out. Another possibility would be to call the callback from
ref_transaction_commit() but that would mean changing the signature of
ref_transaction_begin() to take the callback and user data when
REF_TRANSACTION_ALLOW_PARTIAL is passed.

Yes, I did toy around modifying `ref_transaction_*` at first, but I
think the current implementation is slightly better. Users of the ref
API do not have to worry about complexity of partial transactions unless
they really need to. So in that case, for most users, the API remains
simple and clean, and for specific users who do want partial transaction
support, they can activate it via the flag and use the iterator to
collect the rejections at the end.

That makes sense. I have a slight concern that iterating through the 
errors is O(number of ref updates) rather than O(number of errors). If 
we expect most updates to succeed that it is a shame to have to check 
them all just to see there were no errors. Maybe we could be store the 
errors in a separate list of (update-index, error) pairs to avoid that.

Best Wishes

Phillip