"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Han-Wen Nienhuys <hanwen@xxxxxxxxxx> > > The reftable backend needs to know the hash algorithm for writing the > initialization hash table. > > The initial reftable contains a symref HEAD => "main" (or "master"), which is > agnostic to the size of hash value, but this is an exceptional circumstance, and > the reftable library does not cater for this exception. It insists that all > tables in the stack have a consistent format ID for the hash algorithm. > > Call set_repo_hash_algo directly after reading out GIT_DEFAULT_HASH. Seeing that there is no mention of the_repository in the entire init-db.c file, that the same information is in repo_fmt which is passed to the init-db.c::create_default_files() function, and that create_default_files() is where the initial set of refs is prepared in the current system, it is not clear why this patch is needed (i.e. why we consider the current code that has no mention of the_repository is wrong). Isn't it the matter of passing the hash taken from repo_fmt to the refs API to initialize that part of the repository, instead of relying on half-initialized state in the_repository? It's not like only the hash_algo member is yet to be prepared in the_repository instance at that point in the code. Most of the members are, except for a very few fields initialized by initialize_the_repository(), not filled in the codepath, no? So, this might have been the most convenient way to pass hash_algo down but the patch does not convince me that it is the best way. It may be, but it does not answer questions like "what makes us sure that hash-algo will stay to be the only thing we need to fill early in this codepath?" If the patch _were_ to fill many other members of the_repository to make us pretend as if we did the setup_git_dir() in the repository we just created, and the hash_algo is filled merely as part of it, then it would have been much more convincing that we are moving in the right direction. It would for example mean gitdir/commondir would be filled among other members, and the existing "mkdir()" to manually prepare "refs/" for files backend in init-db.c can probably be removed and instead would be done by calling a new entry-point that initializes the initial set of refs in the refs API. And that entry-point would either rely on the_repository or take a repository instance, and use hash_algo member of that structure to do its thing. So, I dunno. > diff --git a/builtin/init-db.c b/builtin/init-db.c > index 01bc648d41..5c8c67fec6 100644 > --- a/builtin/init-db.c > +++ b/builtin/init-db.c > @@ -391,6 +391,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash > die(_("unknown hash algorithm '%s'"), env); > repo_fmt->hash_algo = env_algo; > } > + repo_set_hash_algo(the_repository, repo_fmt->hash_algo); > } > > int init_db(const char *git_dir, const char *real_git_dir, > > base-commit: faefdd61ec7c7f6f3c8c9907891465ac9a2a1475