On 01/19/2013 01:37 AM, Junio C Hamano wrote:
> This is an early preview of reducing the network cost while talking
> with a repository with tons of refs, most of which are of use by
> very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
> useful only for people who are interested in the changes under
> review).  As long as these narrow audiences have a way to learn the
> names of refs or objects pointed at by the refs out-of-band, it is
> not necessary to advertise these refs.
>
> On the server end, you tell upload-pack that some refs do not have
> to be advertised with the uploadPack.hiderefs multi-valued
> configuration variable:
>
>     [uploadPack]
>         hiderefs = refs/changes
>
> The changes necessary on the client side to allow fetching objects
> at the tip of a ref in hidden hierarchies are much more involved and
> not part of this early preview, but the end user UI is expected to
> be like these:
>
>     $ git fetch $there refs/changes/72/41672/1
>     $ git fetch $there 9598d59cdc098c5d9094d68024475e2430343182
>
> That is, you ask for a refname as usual even though it is not part
> of ls-remote response, or you ask for the commit object that is at
> the tip of whatever hidden ref you are interested in.

Although I can understand the pain of slow network performance, somehow
this proposal gives me the feeling of being expeditious rather than
elegant.  Could the problem be solved in some other way?  Maybe such
references could be stored in a second repository or in a separate
namespace (in the sense of gitnamespaces(7)) to prevent their creating
overhead when they are unneeded?

And *if* reference hiding makes sense, it seems to me that the client,
not the server, should be the one who decides which server references it
is interested in (though I understand that would require a protocol
change).  Otherwise the git repository *relies* on out-of-band channels
for its functionality.
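
To make the two alternatives concrete, here is a minimal sketch of
prefix-based ref filtering.  It is only a toy model and not git's
implementation: the function names, the rule that a prefix matches whole
path components, and the sample refs are all assumptions made for
illustration.

```python
# Toy model of prefix-based ref filtering. NOT git's implementation: the
# function names, the matching rule, and the sample data are all assumptions.

def under_prefix(refname, prefixes):
    """Return True if refname is one of the prefixes or lies below one."""
    for prefix in prefixes:
        prefix = prefix.rstrip("/")
        if refname == prefix or refname.startswith(prefix + "/"):
            return True
    return False


def server_advertise(refs, hidden_prefixes):
    """Server-decides model: refs under a hidden prefix are never listed."""
    return {name: sha for name, sha in refs.items()
            if not under_prefix(name, hidden_prefixes)}


def client_select(refs, wanted_prefixes):
    """Client-decides model: the client names the hierarchies it wants listed."""
    return {name: sha for name, sha in refs.items()
            if under_prefix(name, wanted_prefixes)}


# Made-up sample refs with placeholder object names.
SAMPLE_REFS = {
    "refs/heads/master": "1" * 40,
    "refs/tags/v1.0": "2" * 40,
    "refs/changes/72/41672/1": "3" * 40,
}

if __name__ == "__main__":
    # With server-side hiding, nothing under refs/changes is listed.
    print(sorted(server_advertise(SAMPLE_REFS, ["refs/changes"])))
    # With client-side selection, the same refs stay discoverable on request.
    print(sorted(client_select(SAMPLE_REFS, ["refs/changes"])))
```

In the first model the list of prefixes lives in the server's
configuration, so a client cannot list what was filtered out.  In the
second the client supplies the prefixes, so every ref remains
discoverable through git itself.
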
If I understand correctly, a user would have *no way* to discover, via
git, what hidden references are contained in a remote repository, or
indeed even that the repo contains a hidden namespace.  For example,
this would make it impossible to clean up obsolete "hidden" references
on a remote repository without the supplementary information stored
elsewhere.  And if anybody accidentally creates a reference in a hidden
namespace by hand, it will just sit there undetectably, forever.

I assume (though I've never checked) that a server does not let a client
ask for a SHA1 that is not currently reachable from a server-side
reference, and I assume that you are not proposing to change this
policy.  But allowing objects to be fetched from a hidden reference
opens up some "interesting" possibilities:

* A pusher could upload arbitrary content to a public git server under a
  cryptic hidden reference name.  Most people would be completely unable
  to see this content, unless given the SHA1 or the reference name by
  the pusher.  Thus this mechanism could be used as a dark channel to
  exchange arbitrary data relatively secretly.

* Somebody could push a trojan version of code to a hidden reference in
  a project, then pass the SHA1 to a victim.  The victim might trust the
  code because it comes from a known project website, even though the
  code would be invisible to other project developers and thus
  impossible for them to audit.  And even if they learned about the
  trojan's SHA1, they would be unable to remove it from their repository
  because they have no way to find out the name of the hidden reference!

Obviously these hacks would only be possible for a bad guy with push
privileges to a repository that has turned on hidden references, but I
think they are sobering nevertheless.  These worries would go away if
reference hiding were configured on the client rather than on the
server.

A second point: currently, the outputs of "git show-ref -d" and
"git ls-remote ." are almost identical.
Under your proposal, I believe that the "hiderefs" refs would only be
omitted from the latter.  Would it be useful to add an option to
"git show-ref" to make it omit the "hiderefs" refs?  And maybe another
option to make it display *only* the "hiderefs" refs?

And in the bikeshedding department, I wonder if "hiderefs" is the best
name for the config setting.  "hiderefs" implies to me that the refs are
actively hidden and not available to the client in any way.  But in fact
they are just not advertised; they can be fetched normally.  Maybe
another name would be more suggestive of its true effect, for example
"quietrefs" or "noadvertiserefs".

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/