Re: [PATCH v3 4/4] urlmatch: allow globbing for the URL host part

On Thu, Jan 26, 2017 at 12:43:31PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <patrick.steinhardt@xxxxxxxx> writes:
> 
> > The URL matching function computes for two URLs whether they match or not.
> > The match is performed by splitting up the URL into different parts and
> > then doing an exact comparison with the to-be-matched URL.
> >
> > The main user of `urlmatch` is the configuration subsystem. It lets users
> > set certain configurations based on the URL which is being connected to
> > via keys like `http.<url>.*`. A common use case for this is to set
> > proxies for only some remotes which match the given URL. Unfortunately,
> > having exact matches for all parts of the URL can become quite tedious
> > in some setups. Imagine for example a corporate network where there are
> > dozens or even hundreds of subdomains, which would have to be configured
> > individually.
> >
> > This commit introduces the ability to use globbing in the host-part of
> > the URLs. A user can simply specify a `*` as part of the host name to
> > match all subdomains at this level. For example adding a configuration
> > key `http.https://*.example.com.proxy` will match all subdomains of
> > `https://example.com`.
> 
> This is probably a useful improvement.
> 
> Having said that, when I mentioned "glob", I meant to also support
> something like this:
> 
> 	https://www[1-4].ibm.com/

The problem with extended syntax like the one you propose is that
we would need an escaping mechanism here. '[]' is already allowed
inside the host part to enable IPv6 hosts of the form
'https://[2001:0db8:]/', so a bracket expression in the host
would be ambiguous. We therefore have to be cautious about which
characters to enable for globbing syntax. As of now, I think we
can only safely include '*' and '?' here without an escaping
mechanism.
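
To make the ambiguity concrete, here is a small sketch. It uses
Python's fnmatch module purely as a stand-in for a generic glob
matcher; that choice and the helper name glob_reading are
assumptions for illustration, and nothing here is taken from git's
urlmatch code.

```python
from fnmatch import fnmatchcase

IPV6_HOST = "[2001:0db8:]"          # host part of the IPv6 example URL
CLASS_PATTERN = "www[1-4].ibm.com"  # host pattern suggested in the review


def glob_reading(pattern: str, candidate: str) -> bool:
    """Match candidate against pattern using shell-style glob rules."""
    return fnmatchcase(candidate, pattern)


# Read as a glob, the bracketed IPv6 host is a one-character class,
# so it matches the single character "2" ...
print(glob_reading(IPV6_HOST, "2"))                 # True
# ... and no longer matches the literal host it was meant to name.
print(glob_reading(IPV6_HOST, IPV6_HOST))           # False
# The same rules are what make the proposed pattern work.
print(glob_reading(CLASS_PATTERN, "www3.ibm.com"))  # True
```

A matcher cannot tell the two readings apart from the brackets
alone, which is why bracket expressions would need escaping.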

If additional use cases come up, we can still extend the syntax
later on.

> And when people read "glob", that is what they expect.
> 
> So calling this "the ability to use globbing" is misleading.
> The last paragraph in the log message above needs a bit of
> tweaking, perhaps like this:
> 
> 	Allow users to write an asterisk '*' in place of any 'host'
> 	or 'subdomain' label as part of the host name.  For example,
> 	"http.https://*.example.com.proxy" sets "http.proxy" for all
> 	direct subdomains of "https://example.com",
> 	e.g. "https://foo.example.com", but not
> 	"https://foo.bar.example.com".
> 
> Fortunately, your update to config.txt, which is facing the end
> users, does not misuse the word and instead is explicit that the
> only thing the matcher does is to match '*' to a single hierarchy.
> It is clear that even http://www*.ibm.com/ is not supported from
> the description, which is good.

I agree that globbing is the wrong word here. I'll swap in
"wildcard" where applicable.
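
For illustration only, the single-label wildcard semantics
described above could be sketched as follows. This is Python rather
than the C of the patch, host_matches is a made-up name, and the
sketch looks at the host only (no scheme, port or path), so it
should not be read as the actual implementation.

```python
def host_matches(pattern: str, host: str) -> bool:
    """Return True if host matches pattern label by label.

    A label that is exactly "*" matches any single label; every other
    label must compare equal (case-insensitively). A "*" embedded in a
    longer label, as in "www*", gets no special treatment.
    """
    pattern_labels = pattern.lower().split(".")
    host_labels = host.lower().split(".")
    # A wildcard stands for exactly one label, so the counts must agree.
    if len(pattern_labels) != len(host_labels):
        return False
    return all(p == "*" or p == h
               for p, h in zip(pattern_labels, host_labels))


print(host_matches("*.example.com", "foo.example.com"))      # True
print(host_matches("*.example.com", "foo.bar.example.com"))  # False
print(host_matches("www*.ibm.com", "www1.ibm.com"))          # False
```

Comparing label counts first is what keeps '*' from spanning more
than one level of the hierarchy.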

I'll send a version 4 later on. Thanks again for your feedback
and improvements.

Regards
Patrick
