On Fri, Dec 14, 2018 at 11:55:30AM +0100, Ævar Arnfjörð Bjarmason wrote: > > An interesting implication of this at GitHub (and possibly other > > hosters) is that it exposes objects from shared storage via unexpected > > repos. If I fork torvalds/linux to peff/linux and push up object X, a v0 > > fetch will (by default) refuse to serve it to me. But v2 will happily > > hand it over, which may confuse people into thinking that the object is > > (or at least at some point was) in Linus's repository. > > More importantly this bypasses the security guarantee we've had with the > default of uploadpack.allowAnySHA1InWant=false. IMHO those security guarantees there are overrated (due to delta guessing attacks, though things are not quite as bad if the attacker can't actually push to the repo). But I agree that people do assume it's the case. I was certainly surprised by the v2 behavior, and I don't remember that aspect being discussed. (I suspect it was mostly because in the stateless world, the "fetch" operation requires iterating all of the refs again. That's made even harder by the fact that "ls-refs" takes arguments now, and might have only advertised a subset of the refs. How can "fetch" know which ones were advertised?). > a) Subject to the "SECURITY" section of git-fetch, i.e. maybe this > doesn't even work in the face of a determined attacker. Yeah, this is the part I was thinking about above. > b) Hosting sites like GitHub, GitLab, Gitweb etc. aren't doing > reachability checks when you view /commit/<id>. If I delete my topic > branch it'll be viewable until the next GC kicks in (which would > also be the case with uploadpack.allowAnySHA1InWant=true). Actually, we don't ever prune unless a user asks us to (and certainly users do ask us to after accidentally pushing secret stuff). In GitHub's case, though, we always tell people to consider anything pushed to a public repo to be non-secret, even if it was exposed for only a few seconds. My understanding is that there are literally clients watching the public feeds and sucking down repo data looking for these kinds of mistakes. The /commit/<id> thing has been a frequent topic of discussion within GitHub over the years, and we have looked at actually doing reachability checks. They're expensive, though bitmaps help. > So I wonder how much we need to care about this in practice, but for > some configurations protocol v2 definitely breaks a security promise > we've been making, now whether that promise was always BS due to "a)" > above is another matter. > > I'm inclined to say that in the face of that "SECURITY" section we > should just: > > * Turn on uploadpack.allowReachableSHA1InWant for v0/v1 by > default. Make saying uploadpack.allowReachableSHA1InWant=false warn > with "this won't work, see SECURITY...". > > * The uploadpack.allowTipSHA1InWant setting will also be turned on by > default, and will be much faster, since it'll just degrade to > uploadpack.allowReachableSHA1InWant=true and we won't need any > reachability check. We'll also warn saying that setting it is > useless. No real argument from me. I have always thought those security guarantees were BS. However, I do think that's all orthogonal to the issue I was mentioning, which is that it looks like repo A wants to serve you object X, even if that repo never saw the object. That's unique to shared storage schemes, though. IMHO this is also a user education issue (it's a question of trust: refs are the source of trust, and objects don't need to be trusted because they're immutable). But it's pretty subtle for people who don't know the inner workings of Git. -Peff