On Fri, Feb 7, 2020 at 1:16 AM brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote: > > + > > +Bugfixes should be accompanied by a test and applied to upstream project at > > +https://github.com/google/reftable. > > I think this particular statement may be problematic. The upstream > project requires a CLA, which is a non-starter for many folks. I don't > think we can expect folks who are working on Git to necessarily send > patches upstream with that condition. > > For example, if I end up needing to patch this series to work with > SHA-256, I'm happy for Google to have my changes under the BSD License, > but I won't be signing a CLA. The CLA requirement is something our Open Source lawyers insist on, and it's not for me to renege on. AFAIU, the CLA is essentially a legally sound formulation of "I'm happy for Google to have my changes under the BSD License" > > +#define SHA1_SIZE 20 > > +#define SHA256_SIZE 32 > > I'd rather we didn't hard-code these values here. If you need a size > suitable for allocation, we have GIT_MAX_RAWSZ and GIT_MAX_HEXSZ. Those > are guaranteed to be the right size for any future hash. this code is structured as a standalone library so the same code can be integrated into libgit2 unchanged, and can be unit-tested without needing complex scaffolding. As a result some data (eg. SHA1_SIZE) is duplicate. It also comes with its own strbuf library. > > + { > > + struct merged_table m = { > > + .stack = stack, > > + .stack_len = n, > > + .min = first_min, > > + .max = last_max, > > + .hash_size = SHA1_SIZE, > > This is going to cause problems with SHA-256. We'd want to write this > as the_hash_algo->rawsz, since it can change at runtime. As this is within the library, the hash size should be injected somewhere. I don't see huge problems here (the hash size could be part of the 'struct write_options'), but until we have properly defined how reftable will mark the on-disk files as SHA-256, it will only supports SHA1. I intend to work set the file format in motion, once this SHA1 version is accepted into git-core. There was one question you might be able to answer, though. Can repositories exist in dual hash configuration, i.e. where the ref database must store both SHA1 and SHA256 hashes? > > +static byte zero[SHA256_SIZE] = {}; > > It would be better to refer to null_oid here. See above. > > > +#ifndef REFTABLE_H > > +#define REFTABLE_H > > + > > +#include "system.h" > > + > > +typedef uint8_t byte; > > I think we typically prefer writing "unsigned char" or "uint8_t" instead > of "byte". > > > +typedef byte bool; > > I suspect this is going to cause a whole host of sadness if somebody > ever includes stdbool.h, or if that header ever gets included by an > OS-specific header, since bool is a #define constant for the built-in > type _Bool. > > We typically use int for booleans, but I'm not opposed to using > stdbool.h for it on those systems that we know have it (which, in 2020, > is probably all of them). fixed in https://github.com/google/reftable/commit/a0c794609f2ce8114f014f7260093d4ffdde0d5d > > +struct writer *new_writer(int (*writer_func)(void *, byte *, int), > > + void *writer_arg, struct write_options *opts) > > +{ > > + struct writer *wp = calloc(sizeof(struct writer), 1); > > + options_set_defaults(opts); > > + if (opts->block_size >= (1 << 24)) { > > + /* TODO - error return? */ > > + abort(); > > + } > > + wp->hash_size = SHA1_SIZE; > > Another SHA-256 problem. > > Once this series has good test coverage, I'm happy to send a patch that > squashes in SHA-256 support if you don't want to implement it yourself There is really nothing to implement. The C library will accept hashes of arbitrary size (hence the hash_size field you commented on before), but it just needs a tiny bit of plumbing for the library user to specify the wanted hash size, and a change to the documented format to store it in the file format header. -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. -- Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschäftsführer: Paul Manicle, Halimah DeLaine Prado