On Mon, Apr 6, 2009 at 4:40 PM, Shawn O. Pearce <spearce@xxxxxxxxxxx> wrote:
> The problem is, upload-pack won't perform a reachability analysis
> to determine if a wanted SHA1 is reachable from a current ref.
> Instead it requires that the wanted SHA1 is *exactly* referenced
> by at least one ref.

I probably just don't understand this properly, so please correct me
as needed. My understanding is that

* git-fetch-pack looks at the local named reference to figure out
  the SHA id "X" for the last locally available commit.

* git-upload-pack is given "X" as a delimiter for what to include
  in the pack to send back to git-fetch-pack.

So if I have "X" and I know which remote "Y" I want (because someone
told me, or it's in a manifest), why shouldn't I be able to let
git-upload-pack search for "X" from "Y" if that is exactly what it
does anyway for named references? I accept that it may fail because
"X" is not reachable from "Y" (just give me a sensible error message).

> There's no reason to perform the reachability test on the server
> when you can move it onto the client, and that's exactly what
> git-submodule is doing. It fetches everything, and then assumes
> its reachable post fetch. Since the client has fetched everything,
> the client has the object if its reachable by the server.

Except it will not always be available even when it was reachable at
the source. Here's the real world example that forced me to reject the
use of the submodule command for distributed setups:

* Bob is located at site S where he sets up tree A with a submodule B.
  He uses "submodule init" to initialize B, which will cause it to be
  listed relative to S in A.

* Lisa, at site T, clones A and updates the submodule B. No problem so
  far. Her list of submodules is inherited from S and works for
  updating B.

* Lisa commits a new version of B and then a new version of A. Then
  she asks Kent to merge her changes.

* Kent's clone will also have a submodules list that refers to site S
  (and not T).
  Running "submodule update" after fetching from T fails even though
  all the material is available at T, because Git is then trying to
  fetch the new revision of B from S.

If you try to work around this by not using "submodule init", then you
get a saner tree that can be worked on in a truly serverless fashion,
like with plain git trees, but you have to implement a CM tool on top.

> If the object is no longer reachable by the server's refs (think
> branch rebased) then the object is actually in danger of being GC'd
> off of the server's object store.

This is alright and I would make sure all the refs I want to keep are
reachable from named references to keep git-gc from chomping stuff in
my local tree. In the remote tree, the unnamed reference is either
available or it isn't. If someone made an unnamed reference
unreachable and then garbage-collected it, well so be it. Just tell
the user that the reference can't be found and may in fact not exist
at all and you're done. No exhaustive search necessary.

> One way we get away with this sort of thing in repo is, we only
> put SHA1s in our manifest that are published in branches that
> won't ever rewind or delete. Hence, its a moot point.

What is the syntax for that?

Anyway it's not a moot point. I may later want to use that revision of
the manifest to perform a checkout on every component listed by the
manifest. At that point I expect all the work trees to have exactly
the contents they "should" have for that old version of the manifest.
It's all about affordable reproducibility.

/Klas