Re: [PATCH] init-db: init the_repository->hash_algo early from GIT_DEFAULT_HASH

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Han-Wen Nienhuys <hanwen@xxxxxxxxxx>
>
> The reftable backend needs to know the hash algorithm for writing the
> initialization hash table.
>
> The initial reftable contains a symref HEAD => "main" (or "master"), which is
> agnostic to the size of hash value, but this is an exceptional circumstance, and
> the reftable library does not cater for this exception. It insists that all
> tables in the stack have a consistent format ID for the hash algorithm.
>
> Call set_repo_hash_algo directly after reading out GIT_DEFAULT_HASH.

Seeing that there is no mention of the_repository in the entire
init-db.c file, that the same information is in repo_fmt which is
passed to the init-db.c::create_default_files() function, and that
create_default_files() is where the initial set of refs is prepared
in the current system, it is not clear why this patch is needed
(i.e. why we consider the current code that has no mention of
the_repository is wrong).  Isn't it the matter of passing the hash
taken from repo_fmt to the refs API to initialize that part of the
repository, instead of relying on half-initialized state in
the_repository?  It's not like only the hash_algo member is yet to
be prepared in the_repository instance at that point in the code.
Most of the members are, except for a very few fields initialized by
initialize_the_repository(), not filled in the codepath, no?

So, this might have been the most convenient way to pass hash_algo
down but the patch does not convince me that it is the best way.  It
may be, but it does not answer questions like "what makes us sure
that hash-algo will stay to be the only thing we need to fill early
in this codepath?"

If the patch _were_ to fill many other members of the_repository to
make us pretend as if we did the setup_git_dir() in the repository
we just created, and the hash_algo is filled merely as part of it,
then it would have been much more convincing that we are moving in
the right direction.  It would for example mean gitdir/commondir
would be filled among other members, and the existing "mkdir()" to
manually prepare "refs/" for files backend in init-db.c can probably
be removed and instead would be done by calling a new entry-point
that initializes the initial set of refs in the refs API.  And that
entry-point would either rely on the_repository or take a repository
instance, and use hash_algo member of that structure to do its thing.

So, I dunno.

> diff --git a/builtin/init-db.c b/builtin/init-db.c
> index 01bc648d41..5c8c67fec6 100644
> --- a/builtin/init-db.c
> +++ b/builtin/init-db.c
> @@ -391,6 +391,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash
>  			die(_("unknown hash algorithm '%s'"), env);
>  		repo_fmt->hash_algo = env_algo;
>  	}
> +	repo_set_hash_algo(the_repository, repo_fmt->hash_algo);
>  }
>  
>  int init_db(const char *git_dir, const char *real_git_dir,
>
> base-commit: faefdd61ec7c7f6f3c8c9907891465ac9a2a1475



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux