On Thu, Aug 13, 2009 at 12:33:45PM -0700, Junio C Hamano wrote: > diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c > index 557148a..109b7c8 100644 > --- a/builtin-unpack-objects.c > +++ b/builtin-unpack-objects.c What about this check: > @@ -184,7 +184,7 @@ static int check_object(struct object *obj, int type, void *data) > if (!obj) > return 0; This is neccessary to skip already written objects (eg. blobs, obj_list[i].obj == NULL). The return code is not important in this case. I'm not sure, if fsck_walk can call check_object with obj == NULL under some (rare) conditions. If yes, the return code should be changed to 1. > We probably should write everything as they become unpackable (i.e. as > their delta bases becomes available) while keeping track of object names > (but not data) of structured objects that we received, and running only > one level of reachability check on them at the end. That would certainly > reduce the memory consumption and may simplify the complexity of the code > at the same time. This would defeat the whole idea of the this check: If the precondition (fsck returns OK for a repository) is met, unpack-objects (and index-pack) with --strict should gurantee, that this is still true after receiving the objects (even if somebody intentionally tries to corrupt the repository). As we assume, that all objects (and all their ancestors) already present in the repository are OK, we only have to check new objects and verify, that linked objects in the repository are of the correct type. As soon as we start to write objects without all linked objects already present in the repository, the repository can get inconsistant. Lets assume, unpack-objects/receive-pack is changed according to your proposal: * writeout all objects, as received * checking reachability of new objects * update refs, if everything is OK Then a corruption HOWTO would be: To introduce a object with one of its linked objects missing, left it out of the pack and push it into the repository. unpack-objects will unpack all objects and fail updating the ref (but leave all objects in the repository). As second step, simply send a ref update request, which should succed, as the object is present in the repository. A variant: set the SHA1 of the missing object to the SHA1 of a object of another type. In the second step, you can sent this object to unpack-objects, which will unpack it, as it does not know, that the object is already referenced in the repository. Deleting objects in the case of an error is also not an option, as a parallel push operation could have already used the object. mfg Martin Köger -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html