On Mon, May 22, 2017 at 3:27 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> I liked the suggestion to make the URL a relative path, but this would
>> require you to maintain a mirror in the same places you push git.git
>> to, is that something you'd be willing to do?
>
> After thinking about this a bit more, I know what I think we want a
> bit better.
>
> Relative URL (e.g. ../sha1collisiondetection that sits next to the
> copy of git.git) may be a good way to go.  I can arrange to create
> necessary repository next to git.git on k.org and github.com but I
> need to double check about other places

And here we see another deficit with a single URL:
We have to abide by the same scheme at all hosting endpoints.

For example consider the host
    https://kernel.googlesource.com/pub/scm/git/git
that mirrors from kernel.org. It would be able to bind the submodule at
    https://kernel.googlesource.com/pub/scm/git/git/sha1dc
i.e. it would look like a subdirectory of the main git repo.

This is not an issue for our desired use case, as all hosts can comply
with the scheme that you outlined (url=../sha1...), but it is worth
noting that in the long term we may want to have the ability to
"configure" each remote individually by having out-of-history config
options. I think we would want to solve that via a
"refs/meta/gitmodules" branch that can be adapted per remote.
(original idea from jrnieder@)

> Whether the submodule is referenced by a relative URL from the main
> project, the submodule should not come directly from the upstream,
> and various mirrors that sit next to git.git should not be blind and
> automated "mirrors".

That sounds reasonable for our sanity.

> This is because I do not want us to trust the
> security measures of https://github.com/cr-marcstevens/ repository.
> The consumers already need to trust k.org/pub/scm/git/git.git and by
> ensuring k.org/pub/scm/git/sha1dc is managed the same way, they do
> not have to trust anything extra.

The trust would be transitive, as the said submodule is referenced
via sha1, so the only malicious actions upstream could perform are:

* denial of service: removing a commit that we pointed at in our
  history
* denial of service 2: adding a huge blob to their repo, such that
  anyone who obtains the submodule carelessly is annoyed by a super
  large repo.
* adding additional malicious data (such as illegal numbers and
  algorithms) to a branch, which would be obtained by users cloning
  the submodule carelessly.

> Another reason is that we want to make sure all commits in the
> submodule that we bind to the superproject (i.e. git.git) are always
> in the submodule, regardless of what our upstream does, and one way
> to do so is to have control over _our_ canonical repository for the
> submodule.

By having all repos under one entity of trust, we would not need
to discuss all kinds of possible attacks as above.

> In normal times, it will faithfully follow the upstream
> without doing anything else, but we'd keep the option of anchoring a
> submodule commit that is referenced by the superproject history with
> our own tag, if it is ever rewound away in the upstream history for
> whatever reason.

That makes sense.

Thanks,
Stefan
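
To make the "same scheme at all hosting endpoints" point above concrete, here is a rough Python sketch of how a relative submodule URL such as ../sha1collisiondetection would resolve against different superproject URLs. The function name resolve_relative_url is made up for illustration, and the logic is a deliberate simplification that only handles leading "./" and "../" segments on URLs with a path; git's real resolution has more rules (scp-style host:path URLs, local paths), so treat this as an assumption-laden model rather than what git actually runs.

```python
def resolve_relative_url(remote_url: str, relative: str) -> str:
    """Resolve a relative submodule URL against the superproject's URL.

    Hypothetical helper, simplified: only leading './' and '../'
    segments are interpreted; everything else is appended as-is.
    """
    base = remote_url.rstrip("/")
    rest = relative
    while True:
        if rest.startswith("../"):
            # Step up one level: drop the last path component of the base.
            base = base.rsplit("/", 1)[0]
            rest = rest[3:]
        elif rest.startswith("./"):
            # Stay at the same level: the submodule nests under the base.
            rest = rest[2:]
        else:
            break
    return f"{base}/{rest}"


if __name__ == "__main__":
    mirror = "https://kernel.googlesource.com/pub/scm/git/git"
    # Sibling layout (url=../...): the submodule sits next to the main repo.
    print(resolve_relative_url(mirror, "../sha1collisiondetection"))
    # Nested layout (url=./...): the submodule looks like a subdirectory.
    print(resolve_relative_url(mirror, "./sha1dc"))
```

Under this model a single url= value in .gitmodules yields a sibling location on every host, which is why each mirror would have to provide a repository at that sibling path.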