On Wed, May 04, 2022 at 03:00:35PM +0000, Michael Heemskerk via GitGitGadget wrote: > From: Michael Heemskerk <mheemskerk@xxxxxxxxxxxxx> > > When deleting refs from the loose-files refs backend, files_delete_refs > first rewrites packed-refs if any of the to-be-deleted refs were packed > and then removes loose refs. While making those changes, it invokes the > reference-transaction hook incorrectly; a single transaction is used to > prepare and commit the changes to packed-refs, followed by another > transaction per deleted ref, each with another prepared and committed > reference-transaction hook invocation. > > This behaviour is problematic for a number of reasons. First of all, > deleting a ref through `git branch -d` or `git tag -d` results in a > different sequence of reference-transaction callbacks than deleting the > same ref through `git update-ref`: > > - update-ref of a loose ref: aborted, prepared, committed > - update-ref of a packed ref: prepared, prepared, committed, commited > - branch -d: prepared, committed, aborted, prepared, committed > > The bigger problem is that the packed-refs update is committed before > the prepared reference-transaction invocation is done for the loose > ref. Returning a non-zero exit code from the second > reference-transaction callback leads to an invalid sequence of > reference-transaction callbacks: > > 1. prepared - hook returns 0, so the change is allowed to go through. > 2. committed - git informs the hook that the changes are now final, > but are they really? Any loose refs may still survive if the > subsequent prepared callback is canceled. > 3. aborted - 'fake' invocation from the packed-transaction because the > ref does not exist in packed-refs. > 4. prepared - hook returns 1, so the change should be blocked. > 5. aborted - git informs the hook that it has rolled back the change, > but it really hasn't; packed-refs has already been rewritten and > if the ref only existed as a packed ref, it is now gone. > > The changes to the reference-transaction invocations that were planned > for git 2.36 but have been reverted make the problem more pronounced. > Those changes suppress the 'fake' invocations for the packed-transaction > (invocations 1-3 in the above list). In that case, the prepared callback > in step 4 cannot prevent a ref from being deleted if it only existed in > packed-refs. > > To address the issue, the use a separate transactions to update the > packed and loose refs has been removed from files_delete_refs. Instead, > it now uses a single transaction, queues up the refs-to-be-deleted and > relies on the standard transaction mechanism to invoke the > reference-transaction hooks as expected. > > Signed-off-by: Michael Heemskerk <mheemskerk@xxxxxxxxxxxxx> > --- > refs: honor reference-transaction semantics when deleting refs > > When deleting refs from the loose-files refs backend, files_delete_refs > first rewrites packed-refs if any of the to-be-deleted refs were packed > and then removes loose refs. While making those changes, it invokes the > reference-transaction hook incorrectly; a single transaction is used to > prepare and commit the changes to packed-refs, followed by another > transaction per deleted ref, each with another prepared and committed > reference-transaction hook invocation. > > This behaviour is problematic for a number of reasons. First of all, > deleting a ref through git branch -d or git tag -d results in a > different sequence of reference-transaction callbacks than deleting the > same ref through git update-ref: > > * update-ref of a loose ref: aborted, prepared, committed > * update-ref of a packed ref: prepared, prepared, committed, commited > * branch -d: prepared, committed, aborted, prepared, committed > > The bigger problem is that the packed-refs update is committed before > the prepared reference-transaction invocation is done for the loose ref. > Returning a non-zero exit code from the second reference-transaction > callback leads to an invalid sequence of reference-transaction > callbacks: > > 1. prepared - hook returns 0, so the change is allowed to go through. > 2. committed - git informs the hook that the changes are now final, but > are they really? Any loose refs may still survive if the subsequent > prepared callback is canceled. > 3. aborted - 'fake' invocation from the packed-transaction because the > ref does not exist in packed-refs. > 4. prepared - hook returns 1, so the change should be blocked. > 5. aborted - git informs the hook that it has rolled back the change, > but it really hasn't; packed-refs has already been rewritten and if > the ref only existed as a packed ref, it is now gone. > > The changes to the reference-transaction invocations that were planned > for git 2.36 but have been reverted make the problem more pronounced. > Those changes suppress the 'fake' invocations for the packed-transaction > (invocations 1-3 in the above list). In that case, the prepared callback > in step 4 cannot prevent a ref from being deleted if it only existed in > packed-refs. > > To address the issue, the use a separate transactions to update the > packed and loose refs has been removed from files_delete_refs. Instead, > it now uses a single transaction, queues up the refs-to-be-deleted and > relies on the standard transaction mechanism to invoke the > reference-transaction hooks as expected. > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1228%2Fmheemskerk%2Ffiles-delete-ref-reference-transaction-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1228/mheemskerk/files-delete-ref-reference-transaction-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1228 > > refs/files-backend.c | 33 +++++++-------- > t/t1416-ref-transaction-hooks.sh | 70 ++++++++++++++++++++++++++++++++ > t/t5510-fetch.sh | 6 +-- > 3 files changed, 87 insertions(+), 22 deletions(-) > > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 8db7882aacb..5c23277eda7 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1265,44 +1265,39 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, > struct files_ref_store *refs = > files_downcast(ref_store, REF_STORE_WRITE, "delete_refs"); > struct strbuf err = STRBUF_INIT; > - int i, result = 0; > + int i; > + struct ref_transaction *transaction; > > if (!refnames->nr) > return 0; > > - if (packed_refs_lock(refs->packed_ref_store, 0, &err)) > - goto error; > - > - if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) { > - packed_refs_unlock(refs->packed_ref_store); > + transaction = ref_store_transaction_begin(&refs->base, &err); > + if (!transaction) > goto error; > - } > - > - packed_refs_unlock(refs->packed_ref_store); > > for (i = 0; i < refnames->nr; i++) { > const char *refname = refnames->items[i].string; > - > - if (refs_delete_ref(&refs->base, msg, refname, NULL, flags)) > - result |= error(_("could not remove reference %s"), refname); > + if (ref_transaction_delete(transaction, refname, NULL, > + flags, msg, &err)) > + goto error; > } > > + if (ref_transaction_commit(transaction, &err)) > + goto error; > + > + ref_transaction_free(transaction); > strbuf_release(&err); > - return result; > + return 0; > > error: > - /* > - * If we failed to rewrite the packed-refs file, then it is > - * unsafe to try to remove loose refs, because doing so might > - * expose an obsolete packed value for a reference that might > - * even point at an object that has been garbage collected. > - */ > if (refnames->nr == 1) > error(_("could not delete reference %s: %s"), > refnames->items[0].string, err.buf); > else > error(_("could not delete references: %s"), err.buf); > > + if (transaction) > + ref_transaction_free(transaction); > strbuf_release(&err); > return -1; > } I really like these changes given that they simplify things, but I wonder whether we can do them. In the preimage we're eagerly removing loose refs: any error encountered when deleting a reference is recorded, but we keep on trying to remove the other refs, as well. With the new behaviour we now create a single transaction for all refs and try to commit it. This also means that we'll abort the transaction when locking any of the refs fails, which is a change in behaviour. The current behaviour is explicitly documented in `refs.h:refs_delete_refs()`: /* * Delete the specified references. If there are any problems, emit * errors but attempt to keep going (i.e., the deletes are not done in * an all-or-nothing transaction). msg and flags are passed through to * ref_transaction_delete(). */ int refs_delete_refs(struct ref_store *refs, const char *msg, struct string_list *refnames, unsigned int flags); There are multiple callsites of this function via `delete_refs()`. Now honestly, most of these callsites look somewhat broken: - `bisect.c` simply does its best to clean up bisect state. This usecase looks fine to me. - `builtin/branch.c` reports the branches as deleted even if `delete_refs()` failed. - `builtin/remote.c` also misreports the deleted branches for the `prune` verb. The `rm` verb looks alright: if deletion of any branch failed then it doesn't prune the remote's config in the end and reports an error. - `builtin/fetch.c` also misreports deleted branches with `--prune`. So most of these commands incorrectly handle the case where only a subset of branches has been deleted. This raises the question whether the interface provided by `refs_delete_refs()` is actually sensible if it's so easy to get wrong. It doesn't even report which branches could be removed and which couldn't. Furthermore, the question is whether new backends like the reftable backend which write all refs into a single slice would actually even be in a position to efficiently retain semantics of this function. I'm torn. There are valid usecases for eagerly deleting refs even if a subset of deletions failed, making this change a tough sell, but most of the callsites don't actually handle this correctly in the first place. > diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh > index 27731722a5b..e3d4fe624f7 100755 > --- a/t/t1416-ref-transaction-hooks.sh > +++ b/t/t1416-ref-transaction-hooks.sh > @@ -133,4 +133,74 @@ test_expect_success 'interleaving hook calls succeed' ' > test_cmp expect target-repo.git/actual > ' > > +test_expect_success 'hook allows deleting loose ref if successful' ' > + test_when_finished "rm actual" && > + git branch to-be-deleted $PRE_OID && > + test_hook reference-transaction <<-\EOF && > + echo "$*" >>actual > + EOF > + cat >expect <<-EOF && > + aborted > + prepared > + committed > + EOF > + git branch -d to-be-deleted && > + test_cmp expect actual && > + test_must_fail git rev-parse refs/heads/to-be-deleted > +' > + > +test_expect_success 'hook allows deleting packed ref if successful' ' > + test_when_finished "rm actual" && > + git branch to-be-deleted $PRE_OID && > + git pack-refs --all --prune && > + test_hook reference-transaction <<-\EOF && > + echo "$*" >>actual > + EOF > + cat >expect <<-EOF && > + prepared > + prepared > + committed > + committed > + EOF > + git branch -d to-be-deleted && > + test_cmp expect actual && > + test_must_fail git rev-parse refs/heads/to-be-deleted > +' > + > +test_expect_success 'hook aborts deleting loose ref in prepared state' ' > + test_when_finished "rm actual" && > + test_when_finished "git branch -d to-be-deleted" && > + git branch to-be-deleted $PRE_OID && > + test_hook reference-transaction <<-\EOF && > + echo "$*" >>actual > + exit 1 > + EOF > + cat >expect <<-EOF && > + aborted > + prepared > + aborted > + EOF It's not really clear why we get the first "aborted" result here. I know that it is because we didn't queue up any refs to the packed backend, and thus we don't even try to prepare the transaction. But it's likely confusing to a reader and might thus warrant a comment. On the other hand this is going away anyway if and when my patch series lands again that stops calling the hook from the nested packed backend. > + test_must_fail git branch -d to-be-deleted && > + test_cmp expect actual && > + git rev-parse refs/heads/to-be-deleted > +' > + > +test_expect_success 'hook aborts deleting packed ref in prepared state' ' > + test_when_finished "rm actual" && > + test_when_finished "git branch -d to-be-deleted" && > + git branch to-be-deleted $PRE_OID && > + git pack-refs --all --prune && > + test_hook reference-transaction <<-\EOF && > + echo "$*" >>actual > + exit 1 > + EOF > + cat >expect <<-EOF && > + prepared > + aborted > + EOF > + test_must_fail git branch -d to-be-deleted && > + test_cmp expect actual && > + git rev-parse refs/heads/to-be-deleted > +' > + From your description one more interesting case is when the packed-refs transaction is committed, but the loose-refs backend is aborted. It's a bit harder to test given that we have no way to indicate what backend the reftx hook is being invoked from though. One thing that worries me is that these patches kind of set the current behaviour of driving the reftx hook via both packed and loose backend into stone. My patch series that got reverted is going to change that behaviour though so that we don't execute the hook from the packed backend, and consequentially we'd have to change all these tests again to match the new behaviour. This makes it a lot harder to argue though that we can safely switch to the new behaviour without breaking any assumptions when we even codified our current assumptions into tests. Taking a step back I wonder whether my previous approach to just hide the hook for the packed backend was the right thing to do though. An alternative would be to instead expose additional information to the hook so that it can decide by itself whether it wants to execute the hook or not. This could e.g. trivially be done by exposing a new "GIT_REFS_BACKEND" environment variable to the reftx hook that gets set to either "packed-refs", "loose-refs" or "reftables" depending on which backend is currently in use. Equipped with this information a hook script can then easily ignore any updates to the packed-refs file by itself without having to change the way it is invoked right now and thus we wouldn't regress any currently existing hooks. > test_done > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh > index 4620f0ca7fa..8b09a99c2e8 100755 > --- a/t/t5510-fetch.sh > +++ b/t/t5510-fetch.sh > @@ -169,10 +169,10 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' ' > git clone . prune-fail && > cd prune-fail && > git update-ref refs/remotes/origin/extrabranch main && > - : this will prevent --prune from locking packed-refs for deleting refs, but adding loose refs still succeeds && > - >.git/packed-refs.new && > + : this will prevent --prune from locking refs/remotes/origin/extra for deletion && > + >.git/refs/remotes/origin/extrabranch.lock && > > - test_must_fail git fetch --prune origin > + test_must_fail git fetch --prune origin > outputs 2> errors It would be nice to have an explanation why exactly this change is needed, and why it is fine that the visible behaviour changes here. Patrick > > test_expect_success 'fetch --atomic works with a single branch' ' > > base-commit: 0f828332d5ac36fc63b7d8202652efa152809856 > -- > gitgitgadget
Attachment:
signature.asc
Description: PGP signature