Re: [PATCH v4 4/5] Add reftable library

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 7, 2020 at 1:16 AM brian m. carlson
<sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
> > +
> > +Bugfixes should be accompanied by a test and applied to upstream project at
> > +https://github.com/google/reftable.
>
> I think this particular statement may be problematic.  The upstream
> project requires a CLA, which is a non-starter for many folks.  I don't
> think we can expect folks who are working on Git to necessarily send
> patches upstream with that condition.
>
> For example, if I end up needing to patch this series to work with
> SHA-256, I'm happy for Google to have my changes under the BSD License,
> but I won't be signing a CLA.

The CLA requirement is something our Open Source lawyers insist on,
and it's not for me to renege on. AFAIU, the CLA is essentially a
legally sound formulation of "I'm happy for Google to have my changes
under the BSD License"

> > +#define SHA1_SIZE 20
> > +#define SHA256_SIZE 32
>
> I'd rather we didn't hard-code these values here.  If you need a size
> suitable for allocation, we have GIT_MAX_RAWSZ and GIT_MAX_HEXSZ.  Those
> are guaranteed to be the right size for any future hash.

this code is structured as a standalone library so the same code can
be integrated into libgit2 unchanged, and can be unit-tested without
needing complex scaffolding.

As a result some data (eg. SHA1_SIZE) is duplicate. It also comes with
its own strbuf library.

> > +     {
> > +             struct merged_table m = {
> > +                     .stack = stack,
> > +                     .stack_len = n,
> > +                     .min = first_min,
> > +                     .max = last_max,
> > +                     .hash_size = SHA1_SIZE,
>
> This is going to cause problems with SHA-256.  We'd want to write this
> as the_hash_algo->rawsz, since it can change at runtime.

As this is within the library, the hash size should be injected
somewhere. I don't see huge problems here (the hash size could be part
of the 'struct write_options'), but until we have properly defined how
reftable will mark the on-disk files as SHA-256, it will only supports
SHA1. I intend to work set the file format in motion, once this SHA1
version is accepted into git-core.

There was one question you might be able to answer, though. Can
repositories exist in dual hash configuration, i.e. where the ref
database must store both SHA1 and SHA256 hashes?

> > +static byte zero[SHA256_SIZE] = {};
>
> It would be better to refer to null_oid here.

See above.

>
> > +#ifndef REFTABLE_H
> > +#define REFTABLE_H
> > +
> > +#include "system.h"
> > +
> > +typedef uint8_t byte;
>
> I think we typically prefer writing "unsigned char" or "uint8_t" instead
> of "byte".
>
> > +typedef byte bool;
>
> I suspect this is going to cause a whole host of sadness if somebody
> ever includes stdbool.h, or if that header ever gets included by an
> OS-specific header, since bool is a #define constant for the built-in
> type _Bool.
>
> We typically use int for booleans, but I'm not opposed to using
> stdbool.h for it on those systems that we know have it (which, in 2020,
> is probably all of them).

fixed in https://github.com/google/reftable/commit/a0c794609f2ce8114f014f7260093d4ffdde0d5d

> > +struct writer *new_writer(int (*writer_func)(void *, byte *, int),
> > +                       void *writer_arg, struct write_options *opts)
> > +{
> > +     struct writer *wp = calloc(sizeof(struct writer), 1);
> > +     options_set_defaults(opts);
> > +     if (opts->block_size >= (1 << 24)) {
> > +             /* TODO - error return? */
> > +             abort();
> > +     }
> > +     wp->hash_size = SHA1_SIZE;
>
> Another SHA-256 problem.
>
> Once this series has good test coverage, I'm happy to send a patch that
> squashes in SHA-256 support if you don't want to implement it yourself

There is really nothing to implement. The C library will accept hashes
of arbitrary size (hence the hash_size field you commented on before),
but it just needs a tiny bit of plumbing for the library user to
specify the wanted hash size, and a change to the documented format to
store it in the file format header.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux