When git-fetch(1) isn't called with the `--atomic` flag, then each reference will be updated in a separate transaction. As a result, even if we failed to update a subset of transactions, the remaining refs will be modified after the command has finished. While this pattern is reasonably efficient with the files backend where we'd lock and write each file separately anyway, the upcoming reftable backend could handle such an update a lot more efficiently if it was batched into a single transaction given that it could then create a single new reftable slice instead of creating one slice per updated ref. While this inefficiency can be easily mitigated by using the `--atomic` flag, this flag cannot be used in contexts where we want partial-update semantics. Convert git-fetch(1) to always use a single reference transaction, regardless of whether it is called with `--atomic` or not. The only remaining difference between both modes is that with `--atomic` set, we'd abort the transaciton in case at least one reference cannot be updated. Note that this slightly changes semantics of git-fetch(1): if we hit any unexpected errors like the reference update racing with another process, then we'll now fail to update any references, too. Signed-off-by: Patrick Steinhardt <ps@xxxxxx> --- builtin/fetch.c | 78 ++++++++++++++++--------------------------------- 1 file changed, 25 insertions(+), 53 deletions(-) Hi, until now, we have been quite lenient with creating lots of reference transactions even though they could've been bundled together into a single transaction. After all, it didn't have much of a downside in most contexts with the files backend: we'd have to lock each loose ref separately anyway. I'd like to get some feedback on changing our stance here, due to multiple reasons: - The upcoming reftable backend will be more efficient if we use a single transaction to bundle together multiple ref updates given that it only needs to write one new slice instead of one per update. - Syncing refs to disk can be batched more efficiently if we bundle ref updates. See my initial patch series to implement fsync-ing refs [1] and Peff's benchmarks [2] demonstrating that fetches may become a lot slower. - The reference-transaction hook can be used more efficiently given that it would only need to execute twice, instead of twice per updated ref. It also has a more global view of what's happening. While this is a concern of mine, it's a non-reason in the context of the Git project given that we really ought not to change semantics only to appease this hook. With these reasons in mind, I'm wondering whether we want to accept refactorings which convert existing code to use batched reference transactions. While the potential performance improvements are a rather obvious upside, the downside is that it would definitely change the failure mode in many cases. The example I have here with git-fetch(1) would mean that if we were to race with any other process or if any other unexpected error occurs which leads us to die, then we'll not commit any change to disk. This can be seen as an improvement in consistency, but it can also be seen as a change which breaks current semantics of trying to do as much work as possible. I'd thus love to hear about any opinions on this topic. Patrick [1]: <cover.1636544377.git.ps@xxxxxx> [2]: <YYwvVy6AWDjkWazn@xxxxxxxxxxxxxxxxxxxxxxx> diff --git a/builtin/fetch.c b/builtin/fetch.c index f7abbc31ff..c4cfd55452 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -641,9 +641,6 @@ static struct ref *get_ref_map(struct remote *remote, return ref_map; } -#define STORE_REF_ERROR_OTHER 1 -#define STORE_REF_ERROR_DF_CONFLICT 2 - static int s_update_ref(const char *action, struct ref *ref, struct ref_transaction *transaction, @@ -651,7 +648,6 @@ static int s_update_ref(const char *action, { char *msg; char *rla = getenv("GIT_REFLOG_ACTION"); - struct ref_transaction *our_transaction = NULL; struct strbuf err = STRBUF_INIT; int ret; @@ -661,44 +657,12 @@ static int s_update_ref(const char *action, rla = default_rla.buf; msg = xstrfmt("%s: %s", rla, action); - /* - * If no transaction was passed to us, we manage the transaction - * ourselves. Otherwise, we trust the caller to handle the transaction - * lifecycle. - */ - if (!transaction) { - transaction = our_transaction = ref_transaction_begin(&err); - if (!transaction) { - ret = STORE_REF_ERROR_OTHER; - goto out; - } - } - ret = ref_transaction_update(transaction, ref->name, &ref->new_oid, check_old ? &ref->old_oid : NULL, 0, msg, &err); - if (ret) { - ret = STORE_REF_ERROR_OTHER; - goto out; - } - - if (our_transaction) { - switch (ref_transaction_commit(our_transaction, &err)) { - case 0: - break; - case TRANSACTION_NAME_CONFLICT: - ret = STORE_REF_ERROR_DF_CONFLICT; - goto out; - default: - ret = STORE_REF_ERROR_OTHER; - goto out; - } - } - -out: - ref_transaction_free(our_transaction); if (ret) error("%s", err.buf); + strbuf_release(&err); free(msg); return ret; @@ -1107,12 +1071,10 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, } } - if (atomic_fetch) { - transaction = ref_transaction_begin(&err); - if (!transaction) { - error("%s", err.buf); - goto abort; - } + transaction = ref_transaction_begin(&err); + if (!transaction) { + error("%s", err.buf); + goto abort; } prepare_format_display(ref_map); @@ -1229,21 +1191,31 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, } } - if (!rc && transaction) { - rc = ref_transaction_commit(transaction, &err); - if (rc) { - error("%s", err.buf); - goto abort; - } + if (rc && atomic_fetch) { + error(_("aborting reference updates because of atomic fetch")); + goto abort; } - if (!rc) - commit_fetch_head(&fetch_head); - - if (rc & STORE_REF_ERROR_DF_CONFLICT) + switch (ref_transaction_commit(transaction, &err)) { + case 0: + break; + case TRANSACTION_NAME_CONFLICT: error(_("some local refs could not be updated; try running\n" " 'git remote prune %s' to remove any old, conflicting " "branches"), remote_name); + rc = -1; + break; + default: + error("%s", err.buf); + rc = -1; + break; + } + + if (rc && atomic_fetch) + goto abort; + + if (!rc) + commit_fetch_head(&fetch_head); if (advice_enabled(ADVICE_FETCH_SHOW_FORCED_UPDATES)) { if (!fetch_show_forced_updates) { -- 2.34.1
Attachment:
signature.asc
Description: PGP signature