On Sat, Jan 31, 2009 at 05:19:31PM -0800, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> >   - without either, copy alternates from origin, but _don't_ use
> >     alternates while cloning
>
> Are you talking about a local clone optimization that does hardlink from
> the source repository?

Sorry, I was wrong about what was happening. From reading James' posts
and not doing any experimenting or looking, I had the impression that
doing this:

  # plain repo
  mkdir repo1 &&
  (cd repo1 && git init && echo content >file && git add . && git commit -m one)

  # repo with alternates, but extra content
  git clone -s repo1 repo2 &&
  (cd repo2 && echo content >>file && git commit -a -m two)

  # clone of repo w/ alternates
  git clone repo2 repo3

would cause the final clone to set up the alternate to repo1, but still
pull in the objects. But that isn't the case, of course. Either:

  1. It is a local hardlink clone, in which case we just pull in the
     objects from repo2.

  2. It isn't, in which case we don't copy over the alternates.

> I am fairly certain that copying alternates from the source repository was
> not an intended behaviour but was a consequence of lazy coding of how we
> copy (or link) everything from it. The original was literally the simple
> matter of:
>
>     find objects ! -type d -print | cpio $cpio_quiet_flag -pumd$l "$GIT_DIR/"
>
> whose intention was to copy objects/?? and objects/pack/. and it wasn't
> even part of the design consideration to worry about what would happen to
> the alternates the source repository might have in objects/info/.

Right, I think that is what is going on. And what I was suggesting in my
other email is that it is actively harmful to have this behavior,
because now repo3 depends on repo1, without the user having explicitly
asked for such a relationship (and they might not even be aware of
repo1).

I was tempted to suggest avoiding copying the alternates from repo2 to
repo3. But you can't do that: repo2 is _missing_ objects (they live only
in repo1), so repo3 won't have them either.
Without the alternates file pointing to repo1, repo3 is corrupt. So
simply avoiding copying the alternates file doesn't work; one would have
to actually pull the missing objects in from the alternate before doing
so.

But actually, I think there is even more breakage in hardlinking the
alternates file: alternates files can contain relative paths. So if
repo2 points to "../../../repo1/.git/objects" (which it doesn't in the
example above, as "clone -s" uses absolute paths -- but it is easy
enough to construct a broken case), then repo3 will gain that alternate
pointer, but may be in a totally different directory where that relative
path is broken. And then repo3 is corrupt. So the alternates must be
copied and any relative paths munged for it to work reliably.

The hardlink code operates by default because it was thought to be a
safe optimization that couldn't bite people. But it interacts badly with
the concept of alternates. So I think a sane fix would be to disable
hardlinking if the parent repo is using alternates at all. Then a
vanilla "git clone repo2 repo3" will do the safe but more costly
behavior of actually copying the objects. If the user wants to accept
the risks of alternates, then he can give "-s" explicitly, and git will
track the alternates recursively through repo2 to repo1 at runtime.

-Peff
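
For illustration, the relative-path breakage can be sketched without
running git at all. This is only a rough sketch, not git's own code:
check_alternates is a hypothetical helper, the elsewhere/deep/ location
is made up, and plain directories stand in for real repositories. It
assumes that relative entries in objects/info/alternates are resolved
against the objects directory that contains the file.

```shell
#!/bin/sh
# check_alternates OBJDIR (hypothetical helper, not a git command):
# succeed only if every entry in OBJDIR/info/alternates names an
# existing directory. Relative entries are resolved against OBJDIR.
check_alternates () {
	objdir=$1
	rc=0
	test -f "$objdir/info/alternates" || return 0
	while IFS= read -r alt
	do
		case "$alt" in
		'') continue ;;              # skip blank lines
		/*) path=$alt ;;             # absolute entry, use as-is
		*) path=$objdir/$alt ;;      # relative entry
		esac
		if ! test -d "$path"
		then
			echo "broken alternate in $objdir: $alt" >&2
			rc=1
		fi
	done <"$objdir/info/alternates"
	return $rc
}

# Stand-in layout: repo1 and repo2 are siblings, repo3 lives elsewhere.
demo=$(mktemp -d)
mkdir -p "$demo/repo1/.git/objects" \
	"$demo/repo2/.git/objects/info" \
	"$demo/elsewhere/deep/repo3/.git/objects/info"

# repo2 points at repo1 with the relative path from the text above.
echo "../../../repo1/.git/objects" >"$demo/repo2/.git/objects/info/alternates"

# Copying everything under objects/ carries the alternates file along.
cp "$demo/repo2/.git/objects/info/alternates" \
	"$demo/elsewhere/deep/repo3/.git/objects/info/alternates"

for dir in "$demo/repo2/.git/objects" "$demo/elsewhere/deep/repo3/.git/objects"
do
	if check_alternates "$dir"
	then
		echo "ok: $dir"
	else
		echo "corrupt: $dir"
	fi
done
```

The repo3 stand-in should be the one reported as corrupt, since the
copied pointer now resolves under elsewhere/deep/ instead of next to
repo1. Rewriting the entry to an absolute path before copying it would
be one way to do the "munging" mentioned above.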