Re: [PATCH 5/5] commit-graph-format.txt: fix "Hash Version" description

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 14 Aug 2020 at 14:37, Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> On 8/14/2020 8:21 AM, Martin Ågren wrote:
> > We say that value 1 means "SHA-1", but in fact, it means "whatever
> > the_hash_algo is", see commit c166599862 ("commit-graph: convert to
> > using the_hash_algo", 2018-11-14).
> >
> > Signed-off-by: Martin Ågren <martin.agren@xxxxxxxxx>
> > ---
> >  If we want to be more fine-grained in the future, we'll need to say,
> >  e.g., "2 means SHA-1, 3 means SHA-256" or, perhaps preferrably, bump the
> >  version number.
> >
> >  I wonder: Should we instead say "1 means SHA-1, 2 means SHA-256"? It
> >  could be implemented as "easily" as "if (value_from_header !=
> >  value_from_the_hash_algo) die(...);" for now. Might that pay off in the
> >  long run?
> >
> >  This relates to Stolee's "in a vacuum" comment [1] ... so maybe we're
> >  fine.
>
> I think that was the intention of the byte, but that is not what ended
> up happening.

When I wrote this patch, I did go with "fix <thing>" rather than
"document SHA-256". For the other patches, it's sort of obvious how
those formats are so old that it's no wonder they assumed SHA-1. But
here, we did go to some trouble to try and future-proof things and
already had NewHash in mind. So that calls for "fix <thing>". But I'm
more and more starting to think that it's the implementation that should
be fixed and that the docs should just be extended to add "2 means
SHA-256".

> If we want that to be the case, then we should do that
> work as part of the 2.29 cycle before we release with the ability to
> create SHA-256 repos (which will lock the commit-graph format for these
> repos).

Part of my reasoning behind [3] was that in exactly a situation like this,
we'd be able to say

  With extensions.objectFormat=sha256, Git 2.29-2.30 will barf at the
  commit-graph files that Git 2.31+ generate, and the other way around.
  Users will be able to remove "old" files and regenerate them, and
  shouldn't use a mixed environment.

and know that those users knew this might happen.

But certainly, if we can avoid it altogether by changing behavior
already in 2.29, that would be better.

[3] https://lore.kernel.org/git/20200806202358.2265705-1-martin.agren@xxxxxxxxx/

> (By "we" I mean that I would try to do this work in a way that minimizes
> conflicts with the current commit-graph work in flight [1] [2].)

None of those seems to touch `oid_version()`, so if we can just tweak
that function to return 1 or 2 instead of always 1, I guess that's one
way.

> [1] https://lore.kernel.org/git/pull.676.v2.git.1596941624.gitgitgadget@xxxxxxxxx/
>
> [2] https://lore.kernel.org/git/cover.1597178914.git.me@xxxxxxxxxxxx/
>
> >
> > -  1-byte Hash Version (1 = SHA-1)
> > -      We infer the hash length (H) from this value.
> > +  1-byte Hash Version (1 = SHA-1 in SHA-1 repo, SHA-256 in SHA-256 repo)
> > +      We infer the hash length (H) from the hash algo derived from this value.
>
> If we are _not_ changing the format to have a meaningful value in
> this byte, then this documentation should be updated to state that
> this byte must always have value 1, as it does not provide any
> information.

We could still go

  1 means whatever extensions.objectFormat says,
  2 means SHA-1,
  3 means SHA-256,
  ...

But maybe that would just be crazy.

Thanks for all your comments
Martin




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux