Re: [PATCH v4 4/5] Add reftable library

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-02-06 at 22:55:55, Han-Wen Nienhuys via GitGitGadget wrote:
> diff --git a/reftable/README.md b/reftable/README.md
> new file mode 100644
> index 0000000000..f527da0380
> --- /dev/null
> +++ b/reftable/README.md
> @@ -0,0 +1,19 @@
> +
> +The source code in this directory comes from https://github.com/google/reftable.
> +
> +The VERSION file keeps track of the current version of the reftable library.
> +
> +To update the library, do:
> +
> +   ((cd reftable-repo && git fetch origin && git checkout origin/master ) ||
> +    git clone https://github.com/google/reftable reftable-repo) && \
> +   cp reftable-repo/c/*.[ch] reftable/ && \
> +   cp reftable-repo/LICENSE reftable/ &&
> +   git --git-dir reftable-repo/.git show --no-patch origin/master \
> +    > reftable/VERSION && \
> +   echo '/* empty */' > reftable/config.h
> +   rm reftable/*_test.c reftable/test_framework.*
> +   git add reftable/*.[ch]
> +
> +Bugfixes should be accompanied by a test and applied to upstream project at
> +https://github.com/google/reftable.

I think this particular statement may be problematic.  The upstream
project requires a CLA, which is a non-starter for many folks.  I don't
think we can expect folks who are working on Git to necessarily send
patches upstream with that condition.

For example, if I end up needing to patch this series to work with
SHA-256, I'm happy for Google to have my changes under the BSD License,
but I won't be signing a CLA.

> +#define SHA1_SIZE 20
> +#define SHA256_SIZE 32

I'd rather we didn't hard-code these values here.  If you need a size
suitable for allocation, we have GIT_MAX_RAWSZ and GIT_MAX_HEXSZ.  Those
are guaranteed to be the right size for any future hash.

> +	{
> +		struct merged_table m = {
> +			.stack = stack,
> +			.stack_len = n,
> +			.min = first_min,
> +			.max = last_max,
> +			.hash_size = SHA1_SIZE,

This is going to cause problems with SHA-256.  We'd want to write this
as the_hash_algo->rawsz, since it can change at runtime.

> +static byte zero[SHA256_SIZE] = {};

It would be better to refer to null_oid here.

> +#ifndef REFTABLE_H
> +#define REFTABLE_H
> +
> +#include "system.h"
> +
> +typedef uint8_t byte;

I think we typically prefer writing "unsigned char" or "uint8_t" instead
of "byte".

> +typedef byte bool;

I suspect this is going to cause a whole host of sadness if somebody
ever includes stdbool.h, or if that header ever gets included by an
OS-specific header, since bool is a #define constant for the built-in
type _Bool.

We typically use int for booleans, but I'm not opposed to using
stdbool.h for it on those systems that we know have it (which, in 2020,
is probably all of them).

> +struct writer *new_writer(int (*writer_func)(void *, byte *, int),
> +			  void *writer_arg, struct write_options *opts)
> +{
> +	struct writer *wp = calloc(sizeof(struct writer), 1);
> +	options_set_defaults(opts);
> +	if (opts->block_size >= (1 << 24)) {
> +		/* TODO - error return? */
> +		abort();
> +	}
> +	wp->hash_size = SHA1_SIZE;

Another SHA-256 problem.

Once this series has good test coverage, I'm happy to send a patch that
squashes in SHA-256 support if you don't want to implement it yourself
and you CC me.  If you do want to implement it yourself, you can grab
the transition-stage-4 branch from https://github.com/bk2204/git.git,
rebase your series on top of it, and run the testsuite with
GIT_TEST_DEFAULT_HASH=sha256 to see what breaks.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux