Re: [RFC] fast-import: invalidate pack_id references after loosening

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 25, 2016 at 10:54:02PM +0000, Eric Wong wrote:

> When loosening a pack, the current pack_id gets reused when
> checkpointing and the import does not terminate.  This causes
> problems after checkpointing as the object table, branch, and
> tag lists still contains pre-checkpoint references to the
> recycled pack_id.
> 
> Merely clearing the object_table as suggested by Jeff King in
> http://mid.gmane.org/20160517121330.GA7346@xxxxxxxxxxxxxxxxxxxxx
> is insufficient as the marks set still contains references
> to object entries.
> 
> Wrong pack_id references branch and tags lists do not cause
> errors, but can lead to misleading crash reports and core dumps,
> so they are also invalidated.

Nicely explained.

> +static void invalidate_pack_id(unsigned int id)
> +{
> +	unsigned int h;
> +	unsigned long lu;
> +	struct tag *t;
> +
> +	for (h = 0; h < ARRAY_SIZE(object_table); h++) {
> +		struct object_entry *e;
> +
> +		for (e = object_table[h]; e; e = e->next)
> +			if (e->pack_id == id)
> +				e->pack_id = MAX_PACK_ID;
> +	}
> +
> +	for (lu = 0; lu < branch_table_sz; lu++) {
> +		struct branch *b;
> +
> +		for (b = branch_table[lu]; b; b = b->table_next_branch)
> +			if (b->pack_id == id)
> +				b->pack_id = MAX_PACK_ID;
> +	}
> +
> +	for (t = first_tag; t; t = t->next_tag)
> +		if (t->pack_id == id)
> +			t->pack_id = MAX_PACK_ID;
> +}

This looks pretty straightforward. I do notice that we never shrink the
number of items in the object table when checkpointing, and so our
linear walk will grow ever larger. So if one were to checkpoint every
k-th object, it makes the whole operation quadratic in the number of
objects (actually, O(n^2/k) but k is a constant).

That's probably OK in practice, as I think the actual pack-write already
does a linear walk of the object table to generate the pack index. So
presumably nobody checkpoints often enough for it to be a problem. And
the solution would be to keep a separate list of pointers to objects for
the current pack-id, which would trivially fix both this case and the
one in create_index().  So we can punt on it until somebody actually
runs into it, I think.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]