Re: [PATCH 2/2] Revert "repository: pre-initialize hash algo pointer"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 23, 2018 at 1:56 AM, Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> wrote:
> This reverts commit e26f7f19b6c7485f04234946a59ab8f4fd21d6d1. The root
> problem, git clone not setting up the_hash_algo, has been fixed in the
> previous patch.
>
> As a result of the revert, some code paths that use the_hash_algo
> without initialization is revealed and fixed here. It's basically
> commands that are allowed to run without a repository. The fix here is
> not the best. We probably could figure out the hash algorithm from input
> somehow.
>
> Since this is a dangerous move and could potentially break stuff after
> release (and leads to workaround like the reverted commit), the
> workaround technically remains, but is hidden behind a new environment
> variable GIT_HASH_FIXUP. This should let the users continue to use git
> while we fix the problem.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
> ---
>  builtin/index-pack.c             | 5 +++++
>  common-main.c                    | 4 ++++
>  diff-no-index.c                  | 5 +++++
>  repository.c                     | 2 +-
>  t/helper/test-dump-split-index.c | 2 ++
>  5 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 7e3e1a461c..8ee935504e 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1673,6 +1673,11 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
>         if (prefix && chdir(prefix))
>                 die(_("Cannot come back to cwd"));
>
> +       if (!the_hash_algo) {
> +               warning(_("Running without a repository, assuming SHA-1 hash"));
> +               repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> +       }
> +
>         for (i = 1; i < argc; i++) {
>                 const char *arg = argv[i];
>
> diff --git a/common-main.c b/common-main.c
> index 6a689007e7..12aec36794 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -1,6 +1,7 @@
>  #include "cache.h"
>  #include "exec_cmd.h"
>  #include "attr.h"
> +#include "repository.h"
>
>  /*
>   * Many parts of Git have subprograms communicate via pipe, expect the
> @@ -40,5 +41,8 @@ int main(int argc, const char **argv)
>
>         restore_sigpipe_to_default();
>
> +       if (getenv("GIT_HASH_FIXUP"))
> +               repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> +
>         return cmd_main(argc, argv);
>  }
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 0ed5f0f496..f038f665bc 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -241,6 +241,11 @@ void diff_no_index(struct rev_info *revs,
>         struct strbuf replacement = STRBUF_INIT;
>         const char *prefix = revs->prefix;
>
> +       if (!the_hash_algo) {
> +               warning(_("Running without a repository, assuming SHA-1 hash"));
> +               repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> +       }
> +
>         diff_setup(&revs->diffopt);
>         for (i = 1; i < argc - 2; ) {
>                 int j;
> diff --git a/repository.c b/repository.c
> index 4ffbe9bc94..0d715f4fdb 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -5,7 +5,7 @@
>
>  /* The main repository */
>  static struct repository the_repo = {
> -       NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, &the_index, &hash_algos[GIT_HASH_SHA1], 0, 0
> +       NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, &the_index, NULL, 0, 0

I am not sure I agree with this defense in depth, because it would add
a lot to maintenance burden.
Specifically this part. The series that I sent out usually clashes
here as this is currently a hot area
of the code touched by many different series in flight.

However this is the long term correct thing to do? We assume no algorithm until
the repository can tell us from its config (or we default to sha1 if there is no
configuration present).

I wonder if there is yet another missing case in the enumeration of
the previous patch:
Some commands are able to operate on GIT_OBJECT_DIR instead
of GIT_DIR (git repack?), which may not even explore the full git directory,
and so doesn't know about the hash value.

In the cover letter you reference my series, but comparing the diffstats
(and looking through the patches), I would only expect this one place
to have merge conflicts, which ought to be easy to resolve.
(In my series I break the initializer into multiple lines to help the
future, too)

After some thought, I like this series.

Thanks,
Stefan




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux