On Tue, May 24, 2011 at 05:46:32PM -0700, Jamey Sharp wrote: > Documentation/Makefile | 2 +- > Documentation/git-http-backend.txt | 4 +- > Documentation/gitvirtual.txt | 76 ++++++++++++++++++++++++++++++++ > contrib/completion/git-completion.bash | 2 +- Maybe it would make sense to mention your new options to upload-pack and receive-pack in their manpages; the description can be short, but refer the user to gitvirtual. > +Given many repositories with copies of the same objects (such as > +branches of the same source), sharing a common object store will avoid > +duplication. Alternates provide a single baseline, but don't handle > +ongoing activity in the various repositories. Furthermore, operations > +such as linkgit:git-gc[1] need to know about all of the refs. It's not quite true that alternates provide only a single baseline. They can be updated and objects consolidated over time (e.g., with a nightly repack). The problem is that they require management to do so (this is also a benefit, if you want a sharing policy besides "all repos have all objects"). > +linkgit:git-upload-pack[1] and linkgit:git-receive-pack[1] rewrite the > +names of refs and heads as specified by the --ref-prefix and --head > +options. For instance, --ref-prefix=`virtual/reponame/` will use > ++pass:[refs/virtual/reponame/heads/*]+ and > ++pass:[refs/virtual/reponame/tags/*]+. git-upload-pack and > +git-receive-pack will ignore any references that do not match the > +specified prefix. Thinking on the whole idea a bit more, is there a reason to restrict this to upload-pack and receive-pack? Sure, they are the most obvious places to use it for hosting, but might I not want to be able to do: cd /path/to/mega-repository.git git --ref-prefix=virtual/repo1 log master to do server-side scripting inside the virtual repos (or more likely, setting GIT_REF_PREFIX at the top of your script). > +The --ref-prefix and --head options provide quite a bit of flexibility > +in organizing the refs of virtual repositories within those of the > +underlying repository. In the absence of a strong reason to do > +otherwise, consider following these conventions: > + > +--ref-prefix=`virtual/reponame/`:: > + This puts refs under `refs/virtual/reponame/`, which avoids a > + namespace conflict between `reponame` and built-in ref > + directories such as `heads` and `tags`. > + > +--head=`virtual-HEAD/reponame`:: > + This puts HEADs under `virtual-HEAD/` to avoid namespace > + conflicts with top-level filenames in a git repository. I'm curious if you have a use for this much flexibility. In particular, why do the HEAD and refs prefixes need the ability to be separate? Also, what about other non-HEAD top-level refs? IOW, a true "virtual repository" to me would just be: GIT_REF_PREFIX=refs/virtual/repo1 and then _every_ ref resolution would just prefix that, whether it was in refs/ or not. So you would have: .git/refs/virtual/repo1/HEAD .git/refs/virtual/repo1/refs/heads/master .git/refs/virtual/repo1/refs/tags/v1.0 and so on. And this fits in with the idea of it not just being an upload-pack and receive-pack thing. I could do: GIT_REF_PREFIX=refs/virtual/repo1; export GIT_REF_PREFIX git fetch some-remote and it would write to .git/refs/virtual/repo1/FETCH_HEAD. So the virtual repository is basically just a "chroot" of the ref namespace. And it's dirt simple to implement, because you do the translation at the refs.c layer. > +SECURITY > +-------- > + > +Anyone with access to any virtual repository can potentially access > +objects from any other virtual repository stored in the same underlying > +repository. You can't directly say "give me object ABCD" if you don't > +have a ref to it, but you can do some other sneaky things like: > + > +. Claiming to push ABCD, at which point the server will optimize out the > + need for you to actually send it. Now you have a ref to ABCD and can > + fetch it (claiming not to have it, of course). > + > +. Requesting other refs, claiming that you have ABCD, at which point the > + server may generate deltas against ABCD. > + > +None of this causes a problem if you only host public repositories, or > +if everyone who may read one virtual repo may also read everything in > +every other virtual repo (for instance, if everyone in an organization > +has read permission to every repository). Well, this text is obviously correct and written by a very smart person. ;) You might want to mention that if you do need to handle these security concerns, then the alternates route, even though it creates more management headache, is going to be more flexible with respect to which objects are shared. In fact, given what I said at the very top of the email, I wonder if the documentation would be better structured as "here are two methods for sharing objects, here are reasons why you might choose one or the other, and here is how to use each". -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html