Re: [PATCH] Fix "unpack-objects --strict"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 13, 2009 at 12:33:45PM -0700, Junio C Hamano wrote:
> diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
> index 557148a..109b7c8 100644
> --- a/builtin-unpack-objects.c
> +++ b/builtin-unpack-objects.c

What about this check:
> @@ -184,7 +184,7 @@ static int check_object(struct object *obj, int type, void *data)
>       if (!obj)
>  		return 0;

This is neccessary to skip already written objects (eg. blobs,
obj_list[i].obj == NULL). The return code is not important in this
case.

I'm not sure, if fsck_walk can call check_object with obj == NULL
under some (rare) conditions. If yes, the return code should be
changed to 1.

> We probably should write everything as they become unpackable (i.e. as
> their delta bases becomes available) while keeping track of object names
> (but not data) of structured objects that we received, and running only
> one level of reachability check on them at the end.  That would certainly
> reduce the memory consumption and may simplify the complexity of the code
> at the same time.

This would defeat the whole idea of the this check: If the
precondition (fsck returns OK for a repository) is met, unpack-objects
(and index-pack) with --strict should gurantee, that this is still
true after receiving the objects (even if somebody intentionally tries
to corrupt the repository).

As we assume, that all objects (and all their ancestors) already
present in the repository are OK, we only have to check new objects
and verify, that linked objects in the repository are of the correct
type.

As soon as we start to write objects without all linked objects
already present in the repository, the repository can get inconsistant.

Lets assume, unpack-objects/receive-pack is changed according to your proposal:
* writeout all objects, as received
* checking reachability of new objects
* update refs, if everything is OK

Then a corruption HOWTO would be:

To introduce a object with one of its linked objects missing, left it
out of the pack and push it into the repository. unpack-objects will
unpack all objects and fail updating the ref (but leave all objects in
the repository). As second step, simply send a ref update request,
which should succed, as the object is present in the repository.

A variant: set the SHA1 of the missing object to the SHA1 of a object
of another type. In the second step, you can sent this object to
unpack-objects, which will unpack it, as it does not know, that the
object is already referenced in the repository.

Deleting objects in the case of an error is also not an option, as a
parallel push operation could have already used the object.

mfg Martin Köger




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]