Re: [PATCH v7 06/13] pack-objects: move in_pack out of struct object_entry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 24, 2018 at 07:33:46AM +0100, Nguyễn Thái Ngọc Duy wrote:

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index e1244918a5..b41610569e 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -29,6 +29,8 @@
>  #include "list.h"
>  #include "packfile.h"
>  
> +#define IN_PACK(obj) oe_in_pack(&to_pack, obj)

How come this one gets a macro, but the earlier conversions don't?

I guess the problem is that oe_in_pack() is defined in the generic
pack-objects.h, but &to_pack is only in builtin/pack-objects.c?

I wonder if it would be that bad to just say oe_in_pack(&to_pack, obj)
everywhere. It's longer, but it makes the code slightly less magical to
read.

> @@ -1074,7 +1076,7 @@ static void create_object_entry(const struct object_id *oid,
>  	else
>  		nr_result++;
>  	if (found_pack) {
> -		entry->in_pack = found_pack;
> +		oe_set_in_pack(&to_pack, entry, found_pack);
>  		entry->in_pack_offset = found_offset;
>  	}

it's funny to see in_pack as an external thing, but in_pack_offset still
in the struct. I guess there's nothing to be gained there, since the
offset really does need to be individual (and large).

> diff --git a/cache.h b/cache.h
> index 862bdff83a..b90feb3802 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1635,6 +1635,7 @@ extern struct packed_git {
>  	int index_version;
>  	time_t mtime;
>  	int pack_fd;
> +	int index;		/* for builtin/pack-objects.c */
>  	unsigned pack_local:1,
>  		 pack_keep:1,
>  		 freshened:1,

It's pretty gross to infect this global struct. But I'm not sure there's
an easier way to do it with constant-time lookups. You'd have to build
the packed_git index preemptively in pack-objects, and then always just
pass around the index numbers.  And even that is kind of dicey, since
the packed_git list can grow while we're running.

The alternative is a hash table mapping packed_git pointers into numeric
indices. Yuck.

> +static void prepare_in_pack_by_idx(struct packing_data *pdata)
> +{
> +	struct packed_git **mapping, *p;
> +	int cnt = 0, nr = 1 << OE_IN_PACK_BITS;
> +
> +	if (getenv("GIT_TEST_FULL_IN_PACK_ARRAY")) {
> +		/*
> +		 * leave in_pack_by_idx NULL to force in_pack[] to be
> +		 * used instead
> +		 */
> +		return;
> +	}
> +
> +	ALLOC_ARRAY(mapping, nr);
> +	mapping[cnt++] = NULL; /* zero index must be mapped to NULL */

Why? I guess because index==0 is a sentinel for "we're using the small
index numbers?"

> +	prepare_packed_git();
> +	for (p = packed_git; p; p = p->next, cnt++) {
> +		if (cnt == nr) {
> +			free(mapping);
> +			return;
> +		}
> +		p->index = cnt;
> +		mapping[cnt] = p;
> +	}
> +	pdata->in_pack_by_idx = mapping;
> +}

What happens if we later have to reprepare_packed_git() and end up with
more packs? We only call this for the first pack.

It may well be handled, but I'm having trouble following the code to see
if it is. And I doubt that case is covered by our test suite (since it
inherently involves a race).

>  /*
> + * The size of struct nearly determines pack-objects's memory
> + * consumption. This struct is packed tight for that reason. When you
> + * add or reorder something in this struct, think a bit about this.
> + *

It's funny to see this warning come in the middle. Should it be part of
the final struct reordering at the end?

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux