[PATCH v3 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



(This series is based on 14650065b7 (RelNotes/2.48.0: fix typos etc.,
2025-01-07)).

The bulk of this series is unchanged since last time, save for a couple
of typofixes on spots noticed by Peff and Patrick Steinhardt. More
importantly, it fixes hash_algo_by_ptr() when passing the unsafe SHA-1
variant.

There were a couple of other ideas floated around, but I don't think
they panned out as we had hoped in practice, so I think that this
version should be good to go.

The original cover letter is as follows:

------------

This series implements an idea discussed in [2] which suggests that we
introduce a way to access a wrapped version of a 'struct git_hash_algo'
which represents the unsafe variant of that algorithm, rather than
having individual unsafe_ functions (like unsafe_init_fn() versus
init_fn(), etc.).

This approach is relatively straightforward to implement, and removes a
significant deficiency in the original implementation of
unsafe/non-cryptographic hash functions by making it impossible to
switch between safe- and unsafe variants of hash functions. It also
cleans up the sha1-unsafe test helper's implementation by removing a
large number of "if (unsafe)"-style conditionals.

The series is laid out as follows:

  * The first two patches prepare the hashfile API for the upcoming
    change:

      csum-file: store the hash algorithm as a struct field
      csum-file.c: extract algop from hashfile_checksum_valid()

  * The next patch implements the new 'unsafe_hash_algo()' function at
    the heart of this series' approach:

      hash.h: introduce `unsafe_hash_algo()`

  * The next two patches convert existing callers to use the new
    'unsafe_hash_algo()' function, instead of switching between safe and
    unsafe_ variants of individual functions:

      csum-file.c: use unsafe_hash_algo()
      t/helper/test-hash.c: use unsafe_hash_algo()

  * The final patch drops the unsafe_ function variants following all
    callers being converted to use the new pattern:

      hash.h: drop unsafe_ function variants

Thanks in advance for your review!

[1]: https://lore.kernel.org/git/20241230-pks-meson-sha1-unsafe-v1-0-efb276e171f5@xxxxxx/
[2]: https://lore.kernel.org/git/20241107013915.GA961214@xxxxxxxxxxxxxxxxxxxxxxx/

Taylor Blau (8):
  t/helper/test-tool: implement sha1-unsafe helper
  csum-file: store the hash algorithm as a struct field
  csum-file.c: extract algop from hashfile_checksum_valid()
  hash.h: introduce `unsafe_hash_algo()`
  csum-file.c: use unsafe_hash_algo()
  t/helper/test-hash.c: use unsafe_hash_algo()
  csum-file: introduce hashfile_checkpoint_init()
  hash.h: drop unsafe_ function variants

 builtin/fast-import.c  |  2 +-
 bulk-checkin.c         |  9 ++++++---
 csum-file.c            | 40 +++++++++++++++++++++++++---------------
 csum-file.h            |  2 ++
 hash.h                 | 28 ++++++++++++----------------
 object-file.c          | 41 ++++++++++++++++++++++++++---------------
 t/helper/test-hash.c   |  4 +++-
 t/helper/test-sha1.c   |  7 ++++++-
 t/helper/test-sha1.sh  | 38 ++++++++++++++++++++++----------------
 t/helper/test-sha256.c |  2 +-
 t/helper/test-tool.c   |  1 +
 t/helper/test-tool.h   |  3 ++-
 12 files changed, 107 insertions(+), 70 deletions(-)

Range-diff against v2:
1:  4c1523a04f1 = 1:  ae6b8c75294 t/helper/test-tool: implement sha1-unsafe helper
2:  99cc44895b5 ! 2:  2b79c76e471 csum-file: store the hash algorithm as a struct field
    @@ Commit message
         csum-file: store the hash algorithm as a struct field
     
         Throughout the hashfile API, we rely on a reference to 'the_hash_algo',
    -    and call its _usnafe function variants directly.
    +    and call its _unsafe function variants directly.
     
         Prepare for a future change where we may use a different 'git_hash_algo'
         pointer (instead of just relying on 'the_hash_algo' throughout) by
3:  1ffab2f8289 = 3:  d7deb3f338e csum-file.c: extract algop from hashfile_checksum_valid()
4:  99dcbe2e716 ! 4:  b6b2bb2714f hash.h: introduce `unsafe_hash_algo()`
    @@ Commit message
             if (unsafe)
               algop = unsafe_hash_algo(algop);
     
    -        the_hash_algo->init_fn(...);
    -        the_hash_algo->update_fn(...);
    -        the_hash_algo->final_fn(...);
    +        algop->init_fn(...);
    +        algop->update_fn(...);
    +        algop->final_fn(...);
     
         This removes the existing shortcoming by no longer forcing the caller to
         "remember" which variant of the hash functions it wants to call, only to
    @@ Commit message
         functions, this too will go away after subsequent commits remove all
         direct calls to the unsafe_ variants.
     
    +    Note that hash_algo_by_ptr() needs an adjustment to allow passing in the
    +    unsafe variant of a hash function. All other query functions on the
    +    hash_algos array will continue to return the safe variants of any
    +    function.
    +
         Suggested-by: Jeff King <peff@xxxxxxxx>
         Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
     
    @@ hash.h: struct git_hash_algo {
      };
      extern const struct git_hash_algo hash_algos[GIT_HASH_NALGOS];
      
    -@@ hash.h: static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
    - 	return p - hash_algos;
    +@@ hash.h: int hash_algo_by_length(int len);
    + /* Identical, except for a pointer to struct git_hash_algo. */
    + static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
    + {
    +-	return p - hash_algos;
    ++	size_t i;
    ++	for (i = 0; i < GIT_HASH_NALGOS; i++) {
    ++		const struct git_hash_algo *algop = &hash_algos[i];
    ++		if (p == algop || (algop->unsafe && p == algop->unsafe))
    ++			return i;
    ++	}
    ++	return GIT_HASH_UNKNOWN;
      }
      
     +const struct git_hash_algo *unsafe_hash_algo(const struct git_hash_algo *algop);
5:  2dcc2aa6803 = 5:  ca67de80971 csum-file.c: use unsafe_hash_algo()
6:  a2b9ef03080 = 6:  21b175b07ff t/helper/test-hash.c: use unsafe_hash_algo()
7:  94c07fd8a55 = 7:  850d4f407db csum-file: introduce hashfile_checkpoint_init()
8:  f5579883816 ! 8:  0c4d006e6e8 hash.h: drop unsafe_ function variants
    @@ Commit message
     
         to
     
    -        unsafe_hash_algo(the_hash_algo)->unsafe_init_fn();
    +        unsafe_hash_algo(the_hash_algo)->init_fn();
     
         and similar, we can remove the scaffolding for the unsafe_ function
         variants and force callers to use the new unsafe_hash_algo() mechanic

base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
-- 
2.48.0.rc2.35.g0c4d006e6e8




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux