Patrick Steinhardt <ps@xxxxxx> writes: [snip] > diff --git a/refs.c b/refs.c > index 9b112b0527..f7c7765d23 100644 > --- a/refs.c > +++ b/refs.c > @@ -2570,3 +2570,308 @@ int ref_update_check_old_target(const char *referent, struct ref_update *update, > referent, update->old_target); > return -1; > } > + > +struct migration_data { > + struct ref_store *old_refs; > + struct ref_transaction *transaction; > + struct strbuf *errbuf; > +}; > + > +static int migrate_one_ref(const char *refname, const struct object_id *oid, > + int flags, void *cb_data) > +{ > + struct migration_data *data = cb_data; > + struct strbuf symref_target = STRBUF_INIT; > + int ret; > + > + if (flags & REF_ISSYMREF) { > + ret = refs_read_symbolic_ref(data->old_refs, refname, &symref_target); > + if (ret < 0) > + goto done; > + > + ret = ref_transaction_update(data->transaction, refname, NULL, null_oid(), > + symref_target.buf, NULL, > + REF_SKIP_CREATE_REFLOG | REF_NO_DEREF, NULL, data->errbuf); > + if (ret < 0) > + goto done; > + } else { > + ret = ref_transaction_create(data->transaction, refname, oid, > + REF_SKIP_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION, > + NULL, data->errbuf); > + if (ret < 0) > + goto done; > + } I was a little perplexed about the first scenario being `ref_transaction_update` and the second being `ref_transaction_create`, I then realized that this is because the latter doesn't support creating symrefs yet (changes in my series kn/update-ref-symref), makes sense to do it this way. [snip] > +int repo_migrate_ref_storage_format(struct repository *repo, > + enum ref_storage_format format, > + unsigned int flags, > + struct strbuf *errbuf) > +{ > + struct ref_store *old_refs = NULL, *new_refs = NULL; > + struct ref_transaction *transaction = NULL; > + struct strbuf buf = STRBUF_INIT; > + struct migration_data data; > + size_t reflog_count = 0; > + char *new_gitdir = NULL; > + int did_migrate_refs = 0; > + int ret; > + > + old_refs = get_main_ref_store(repo); Should we add a check to ensure the `old_refs->repo->ref_storage_format` and `format` are different? > + > + /* > + * We do not have any interfaces that would allow us to write many > + * reflog entries. Once we have them we can remove this restriction. > + */ > + if (refs_for_each_reflog(old_refs, count_reflogs, &reflog_count) < 0) { > + strbuf_addstr(errbuf, "cannot count reflogs"); > + ret = -1; > + goto done; > + } > + if (reflog_count) { > + strbuf_addstr(errbuf, "migrating reflogs is not supported yet"); > + ret = -1; > + goto done; > + } Isn't this restrictive? It would be nice to perhaps say "git refs migrate --ignore-reflogs", which could make it possible to not care about reflogs. But maybe that can be part of a follow up. > + /* > + * Worktrees complicate the migration because every worktree has a > + * separate ref storage. While it should be feasible to implement, this > + * is pushed out to a future iteration. > + * > + * TODO: we should really be passing the caller-provided repository to > + * `has_worktrees()`, but our worktree subsystem doesn't yet support > + * that. > + */ > + if (has_worktrees()) { > + strbuf_addstr(errbuf, "migrating repositories with worktrees is not supported yet"); > + ret = -1; > + goto done; > + } > + Same as above. > + /* > + * The overall logic looks like this: > + * > + * 1. Set up a new temporary directory and initialize it with the new > + * format. This is where all refs will be migrated into. > + * > + * 2. Enumerate all refs and write them into the new ref storage. > + * This operation is safe as we do not yet modify the main > + * repository. > + * > + * 3. If we're in dry-run mode then we are done and can hand over the > + * directory to the caller for inspection. If not, we now start > + * with the destructive part. > + * > + * 4. Delete the old ref storage from disk. As we have a copy of refs > + * in the new ref storage it's okay(ish) if we now get interrupted > + * as there is an equivalent copy of all refs available. > + * > + * 5. Move the new ref storage files into place. > + * > + * 6. Change the repository format to the new ref format. > + */ > + strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX"); > + new_gitdir = mkdtemp(xstrdup(buf.buf)); > + if (!new_gitdir) { > + strbuf_addf(errbuf, "cannot create migration directory: %s", > + strerror(errno)); > + ret = -1; > + goto done; > + } > + > + new_refs = ref_store_init(repo, format, new_gitdir, > + REF_STORE_ALL_CAPS); > + ret = ref_store_create_on_disk(new_refs, 0, errbuf); > + if (ret < 0) > + goto done; > + > + transaction = ref_store_transaction_begin(new_refs, errbuf); > + if (!transaction) > + goto done; > + > + data.old_refs = old_refs; > + data.transaction = transaction; > + data.errbuf = errbuf; > + > + /* > + * We need to use the internal `do_for_each_ref()` here so that we can > + * also include broken refs and symrefs. These would otherwise be > + * skipped silently. > + * > + * Ideally, we would do this call while locking the old ref storage > + * such that there cannot be any concurrent modifications. We do not > + * have the infra for that though, and the "files" backend does not > + * allow for a central lock due to its design. It's thus on the user to > + * ensure that there are no concurrent writes. > + */ > + ret = do_for_each_ref(old_refs, "", NULL, migrate_one_ref, 0, > + DO_FOR_EACH_INCLUDE_ROOT_REFS | DO_FOR_EACH_INCLUDE_BROKEN, > + &data); > + if (ret < 0) > + goto done; > + > + /* > + * TODO: we might want to migrate to `initial_ref_transaction_commit()` > + * here, which is more efficient for the files backend because it would > + * write new refs into the packed-refs file directly. At this point, > + * the files backend doesn't handle pseudo-refs and symrefs correctly > + * though, so this requires some more work. > + */ > + ret = ref_transaction_commit(transaction, errbuf); > + if (ret < 0) > + goto done; > + did_migrate_refs = 1; > + > + if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) { > + printf(_("Finished dry-run migration of refs, " > + "the result can be found at '%s'\n"), new_gitdir); > + ret = 0; > + goto done; > + } > + > + /* > + * Until now we were in the non-destructive phase, where we only > + * populated the new ref store. From hereon though we are about > + * to get hands by deleting the old ref store and then moving > + * the new one into place. > + * > + * Assuming that there were no concurrent writes, the new ref > + * store should have all information. So if we fail from hereon > + * we may be in an in-between state, but it would still be able > + * to recover by manually moving remaining files from the > + * temporary migration directory into place. > + */ This also means that the recovery would only be possible into the new format. Makes sense. [snip]
Attachment:
signature.asc
Description: PGP signature