Re: [PATCH v3 12/12] hash: stop depending on `the_repository` in `null_oid()`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 08, 2025 at 08:05:09AM -0800, Elijah Newren wrote:
> On Fri, Mar 7, 2025 at 6:20 AM Patrick Steinhardt <ps@xxxxxx> wrote:
> >
> > The `null_oid()` function returns the object ID that only consists of
> > zeroes. Naturally, this ID also depends on the hash algorithm used, as
> > the number of zeroes is different between SHA1 and SHA256. Consequently,
> > the function returns the hash-algorithm-specific null object ID.
> >
> > This is currently done by depending on `the_hash_algo`, which implicitly
> > makes us depend on `the_repository`. Refactor the function to instead
> > pass in the hash algorithm for which we want to retrieve the null object
> > ID. Adapt callsites accordingly by passing in `the_repository`, thus
> > bubbling up the dependency on that global variable by one layer.
> >
> > There are a couple of trivial exceptions for subsystems that already got
> > rid of `the_repository`. These subsystems instead use the repository
> > that is available via the calling context:
> >
> >   - "builtin/grep.c"
> >   - "grep.c"
> >   - "refs/debug.c"
> >
> > There are also two non-trivial exceptions:
> >
> >   - "diff-no-index.c": Here we know that we may not have a repository
> >     initialized at all, so we cannot rely on `the_repository`. Instead,
> >     we adapt `diff_no_index()` to get a `struct git_hash_algo` as
> >     parameter. The only caller is located in "builtin/diff.c", where we
> >     know to call `repo_set_hash_algo()` in case we're running outside of
> >     a Git repository. Consequently, it is fine to continue passing
> >     `the_repository->hash_algo` even in this case.
> >
> >   - "builtin/ls-files.c": There is an in-flight patch series that drops
> >     `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
> >     conflict because we use `null_oid()` in `show_submodule()`. The
> >     value is passed to `repo_submodule_init()`, which may use the object
> >     ID to resolve a tree-ish in the superproject from which we want to
> >     read the submodule config. As such, the object ID should refer to an
> >     object in the superproject, and consequently we need to use its hash
> >     algorithm.
> >
> > This means that we could in theory just not bother about this edge case
> > at all and just use `the_repository` in "diff-no-index.c". But doing so
> > would feel misdesigned.
> 
> Very minor, but this was a bit jarring to me -- shouldn't this
> paragraph be indented over since it is a continuation of the second
> bullet above?

Yes, you're right!

> > @@ -4304,7 +4304,7 @@ static int process_entry(struct merge_options *opt,
> >                 /* Deleted on both sides */
> >                 ci->merged.is_null = 1;
> >                 ci->merged.result.mode = 0;
> > -               oidcpy(&ci->merged.result.oid, null_oid());
> > +               oidcpy(&ci->merged.result.oid, null_oid(the_hash_algo));
> >                 assert(!ci->df_conflict);
> >                 ci->merged.clean = !ci->path_conflict;
> >         }
> 
> What you have is an improvement since it's at least making things
> explicit, but these should really be opt->repo->hash_algo.

Oh, yeah, I suspect that we'll have a bunch of places where we can
already plug in a repository as available via the context. I typically
refrain from doing though unless really necessary (e.g. when the file at
hand doesn't declare `USE_THE_REPOSITORY_VARIABLE` anymore). This is so
that neither I nor reviewers have to carefully vet every callsite, but
to make this a mechanical change that is obviously correct because it is
identical to the old state. So I'm basically leaving it to the next
patch series that gets rid of `USE_THE_REPOSITORY_VARIABLE` to think
more carefully about those callsites.

Thanks!

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux