Re: [PATCH 2/2] submodule: munge paths to submodule git directories

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 14, 2018 at 2:12 PM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
>
> Hi,
>
> Stefan Beller wrote:
> > On Tue, Aug 14, 2018 at 11:57 AM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
>
> >> Second, what if we store the pathname in config?  We already store the
> >> URL there:
> >>
> >>         [submodule "plugins/hooks"]
> >>                 url = https://gerrit.googlesource.com/plugins/hooks
> >>
> >> So we could (as a followup patch) do something like
> >>
> >>         [submodule "plugins/hooks"]
> >>                 url = https://gerrit.googlesource.com/plugins/hooks
> >>                 gitdirname = plugins%2fhooks
> >>
> >> and use that for lookups instead of regenerating the directory name.
> >> What do you think?
> >
> > As I just looked at worktree code, this sounds intriguing for the wrong
> > reason (again), as a user may want to point the gitdirname to a repository
> > that they have already on disk outside the actual superproject. They
> > would be reinventing worktrees in the submodule space. ;-)
> >
> > This would open up the security hole that we just had, again.
> > So we'd have to make sure that the gitdirname (instead of the
> > now meaningless subsection name) is proof to ../ attacks.
> >
> > I feel uneasy about this as then the user might come in
> > and move submodules and repoint the gitdirname...
> > to a not url encoded path. Exposing this knob just
> > asks for trouble, no?
>
> What if we forbid directory separator characters in the gitdirname?

Fine with me, but ideally we'd want to allow sharding the
submodules. When you have 1000 submodules
we'd want them not all inside the toplevel "modules/" ?
Up to now we could just wave hands and claim the user
(who is clearly experienced with submodules as they
use so many of them) would shard it properly.

With this scheme we loose the ability to shard.

> [...]
> > What would happen if gitdirname is changed as part of
> > history? (The same problem we have now with changing
> > the subsection name)
>
> In this proposal, it would only be read from config, not from
> .gitmodules.

Ah good point. That makes sense.

Stepping back a bit regarding the config:
When I clone gerrit (or any repo using submodules)

$ git clone --recurse-submodules \
  https://gerrit.googlesource.com/gerrit g2
[...]
$ cat g2/.git/config
[submodule]
    active = .
[submodule "plugins/codemirror-editor"]
    url = https://gerrit.googlesource.com/plugins/codemirror-editor
[... more urls to follow...]

Originally we have had the url in the config, (a) that we can change
the URLs after the "git submodule init" and "git submodule update"
step that actually clones the submodule if not present and much more
importantly (b) to know which submodule "was initialized/active".

Now that we have the submodule.active or even
submodule.<name>.active flags, we do not need (b) any more.
So the URL turns into a useless piece of cruft that just is unneeded
and might confuse the user.

So maybe I'd want to propose a patch that removes
submodule.<name>.url from the config once it is cloned.
(I just read up on "submodule sync" again, but that might not
even need special care for this new world)

And with all that said, I think if we can avoid having the submodules
gitdir in the config, the config would look much cleaner, too.

But maybe that is the wrong thing to optimize for. ;-)
It just demonstrates that we'd have a submodule specific
thing again in the config.

So my preference would be to do a similar thing as
url-encoding as that solves the issue of slashes and
potentially of case sensitivity (e.g. encode upper case A
as lower case with underscore _a)

However the transition worries me, as it transitions
within the same namespace. Back then when we
transferred from the .git dir inside the submodules
working tree to the embedded version in the superprojects
.git dir, there was no overlap, and any potential directory
in .git/modules/ that was already there, was highly
unusual, so asking the user for help is the reasonable
thing to do.
But now we might run into issues that has overlap between
old(name as is) and new (urlencoded) world.

So maybe we also want to transition from

    modules/<name>

to

    submodules/<urlencoded(<name>)>

Thanks,
Stefan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux