Junio C Hamano <gitster@xxxxxxxxx> writes: > Duy Nguyen <pclouds@xxxxxxxxx> writes: > >> What do you mean by "partial history"? Do we have dangling pointers >> after doing that commit walker? > > "^C" will leave the objects and it is safe because it will not > update refs. > > But your code that does not verify the full connectivity from such > an object (that is outside the transferred pack) to the rest of the > history will then make the resulting repository broken, if you > update your ref to point at such an object, no? Ading a single > has_sha1_file() only verifies that single object, not the history > behind it. Let's illustrate. Imagine your project as a whole has this history: ---A---B---C---D---E When you cloned, it only had up to A, so that is what you have. Then you try http walker, which slurps E, wants to go to D and dig down, but after fetching E, some trees and blobs in E, and fetching D, but before completing D' trees and blobs, your ISP cuts you off. You have these in your object store: ---A D---E but your ref is pointing at A, because we promise that we make sure full connectivity before we update the ref, and even if you have commits D and E, the associated trees, blobs, and commits behind this subpart of the history are missing. Now you try to fetch from another mirror over the pack transport. You advertise that you have A (but you do not advertise E, because your refs do not point at it), and you expect all objects that are listed in "rev-list --objects A..E" to be gien to you. Unfortunately, the mirror is broken, in that it only packs the objects that appear in "rev-list --objects A..B", but still tells you that it is sending objects to complete history leading to E. Your object store would have objects like this after this transfer: ---A---B D---E You may still have commits D and E unpruned, but are missing C, and trees and blobs that are needed in D or E. You have to walk the history from E and list the necessary objects yourself, running has_sha1_file() on all of them, to prove that you have everything needed, and the only things you can trust are your refs (in this case, A). If you verify that all the objects in the transferred pack are complete, and also verify that the object the transfer proposes to update your ref to is _in_ that pack, then you can detect a mirror that is broken and only sends objects for A..B, but that does not actually solve the issue. Imagine another broken mirror that instead sends objects for E. E, its trees and all its blobs may be in the pack and the only outgoing link from that pack may be a pointer out of E pointing at D. You happen to _have_ it in your object store, but that does not mean you can update your ref to E (remember, you do not have trees and blobs to complete D, or the history behind it). The current check is implemented in the way we currently do it, _not_ because we were stupid and did not realize the optimization possibility (in other words, what your patch proposes is not new), but because we rejected this approach because it was not safe. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html