On Tue, Aug 14, 2018 at 11:57 AM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
>
> Hi,
>
> Brandon Williams wrote:
> > On 08/09, Jeff King wrote:
>
> >> One interesting thing about url-encoding is that it's not one-to-one.
> >> This case could also be %2F, which is a different file (on a
> >> case-sensitive filesystem). I think "%20" and "+" are similarly
> >> interchangeable.
> >>
> >> If we were decoding the filenames, that's fine. The round-trip is
> >> lossless.
> >>
> >> But that's not quite how the new code behaves. We encode the input and
> >> then check to see if it matches an encoding we previously performed. So
> >> if our urlencode routines ever change, this will subtly break.
> >>
> >> I don't know how much it's worth caring about. We're not that likely to
> >> change the routines ourself (though certainly a third-party
> >> implementation would need to know our exact url-encoding decisions).
> >
> > This is exactly the reason why I wanted to get some opinions on what the
> > best thing to do here would be. I _think_ the best thing would probably
> > be to write a specific routine to do the conversion, and it wouldn't
> > even have to be all that complex. Basically I'm just interested in
> > converting '/' characters so that things no longer behave like
> > nested directories.
>
> First of all, I think the behavior with this patch is already much
> better than the previous status quo. I'm using the patch now and am
> very happy with it.
>
> Second, what if we store the pathname in config? We already store the
> URL there:
>
>         [submodule "plugins/hooks"]
>                 url = https://gerrit.googlesource.com/plugins/hooks
>
> So we could (as a followup patch) do something like
>
>         [submodule "plugins/hooks"]
>                 url = https://gerrit.googlesource.com/plugins/hooks
>                 gitdirname = plugins%2fhooks
>
> and use that for lookups instead of regenerating the directory name.
> What do you think?

As I just looked at worktree code, this sounds intriguing for the wrong
reason (again), as a user may want to point the gitdirname to a
repository that they already have on disk outside the actual
superproject. They would be reinventing worktrees in the submodule
space. ;-)

This would open up the security hole that we just had, again.
So we'd have to make sure that the gitdirname (instead of the
now meaningless subsection name) is proof against ../ attacks.

I feel uneasy about this as then the user might come in
and move submodules and repoint the gitdirname...
to a path that is not url-encoded.
Exposing this knob just asks for trouble, no?

On the other hand, the only requirement for the "name" is
now uniqueness, and that is implied with subsections,
so I guess it looks elegant.

What would happen if gitdirname is changed as part of
history? (The same problem we have now with changing
the subsection name)

The more I think about it the less appealing this is, but it looks elegant.

Stefan
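
To make the two concerns in this thread concrete, here is a rough Python
sketch (not Git's code, which is C). The helper names encode_name and
is_safe_gitdirname are hypothetical, and the validation rule is only one
possible reading of "proof against ../ attacks". It shows that two
spellings of the same encoded name decode identically but do not compare
equal as strings, and how a stored gitdirname value might be checked
before it is used as a directory name.

```python
from urllib.parse import quote, unquote


def encode_name(name: str) -> str:
    """Hypothetical encoder: percent-encode every reserved character,
    including '/', so the result is a single flat path component."""
    return quote(name, safe="")


def is_safe_gitdirname(value: str) -> bool:
    """Hypothetical check for a stored gitdirname value.

    Assumption: the value is used verbatim as one directory name, so it
    must not be empty, '.' or '..', and must not contain a path separator.
    """
    if value in ("", ".", ".."):
        return False
    return "/" not in value and "\\" not in value


# Python's quote() emits upper-case hex digits, unlike the lower-case
# spelling used in the config example quoted above.
ours = encode_name("plugins/hooks")
theirs = "plugins%2fhooks"

# Decoding is lossless: both spellings name the same submodule.
same_after_decode = unquote(ours) == unquote(theirs)

# Comparing encoded strings is not: the lookup would miss.
same_encoded = ours == theirs
```

Storing the encoded name in config, as proposed, sidesteps the comparison
problem because the lookup no longer depends on re-running the encoder,
but it makes a check along the lines of is_safe_gitdirname necessary for
any value read back from config.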