I have added this as a TODO:

	* Improve handling of plus signs in email address user names, and
	  perhaps improve URL parsing
	  * http://archives.postgresql.org/pgsql-hackers/2010-10/msg00772.php

---------------------------------------------------------------------------

Thom Brown wrote:
> Hi,
>
> I noticed that if I run this:
>
> SELECT alias, description, token FROM
> ts_debug('http://www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary');
>
> I get:
>
>   alias   |  description  |                              token
> ----------+---------------+-----------------------------------------------------------------
>  protocol | Protocol head | http://
>  url      | URL           | www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary
>  host     | Host          | www.postgresql.org:2345
>  url_path | URL path      | /directory/page.html?version=9.1&build=alpha1#summary
> (4 rows)
>
>
> It could be me being picky, but I don't regard parameters or page
> fragments as part of the URL path.  Ideally, I'd sort of expect:
>
>     alias     |  description  |                              token
> --------------+---------------+-----------------------------------------------------------------
>  protocol     | Protocol head | http://
>  url          | URL           | www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary
>  host         | Host          | www.postgresql.org
>  port         | Port          | 2345
>  url_path     | URL path      | /directory/page.html
>  query_string | Query string  | version=9.1&build=alpha1
>  fragment     | Page fragment | summary
> (7 rows)
>
> ... of course that's if there was support for query strings and page
> fragments, which there isn't.  But if changes were made to support my
> definition of a URL path, they'd have to be considered breaking
> changes.
>
> But my main gripe is with the name "url_path".
>
> Also:
>
> SELECT alias, description, token FROM ts_debug('myname+priority@xxxxxxxxx');
>
> Yields:
>
>    alias   |   description   |       token
> -----------+-----------------+--------------------
>  asciiword | Word, all ASCII | myname
>  blank     | Space symbols   | +
>  email     | Email address   | priority@xxxxxxxxx
> (3 rows)
>
> The entire string I entered is a valid email address, and isn't
> totally uncommon.  Shouldn't such email address styles be taken
> into account?  The example above incorrectly identifies the
> email address since the real destination address would most likely be
> myname@xxxxxxxxxx
>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

--
  Bruce Momjian  <bruce@xxxxxxxxxx>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
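
For reference, the token breakdown Thom expects for the URL, and the
split of a plus-addressed email, can be sketched outside the text search
parser.  This is only an illustration in Python using the standard
library's urllib.parse.urlsplit; it is not how the text search parser
works or would be changed.  The function names split_url and
split_plus_address are made up for the example, and example.org stands
in for the obfuscated domain in the original message.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into the components listed in the expected ts_debug output.

    Hypothetical helper: the dictionary keys mirror the aliases proposed in
    the thread (port, query_string, fragment), which the parser lacks today.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme + "://",
        "host": parts.hostname,          # host name without the port
        "port": parts.port,              # int, or None when no port is given
        "url_path": parts.path,          # path only, no query or fragment
        "query_string": parts.query,     # text after '?', before '#'
        "fragment": parts.fragment,      # text after '#'
    }


def split_plus_address(address):
    """Split 'user+tag@domain' into (user, tag, domain).

    Hypothetical helper: the tag is empty when the local part has no '+'.
    """
    local, _, domain = address.rpartition("@")
    user, _, tag = local.partition("+")
    return user, tag, domain
```

With the URL from the report, split_url returns the host
www.postgresql.org, the port 2345, the path /directory/page.html, the
query string version=9.1&build=alpha1 and the fragment summary.  For
myname+priority@example.org, split_plus_address returns myname, priority
and example.org, so the whole string can be kept as one address while
the likely destination mailbox (myname) stays recoverable.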