Re: PostgreSQL SQL Tricks: faster urldecode

Marc Mamin <M.Mamin@xxxxxxxxxxxx> · Fri, 20 Sep 2013 16:47:16 +0000



> From: Merlin Moncure [mmoncure@xxxxxxxxx]
> Sent: Friday, 20 September 2013 17:43
> 
> >  On Fri, Sep 20, 2013 at 10:26 AM, Marc Mamin <M.Mamin@xxxxxxxxxxxx> wrote:
> > Hi,
> > here is a function that is about 8x faster than the one described in the PostgreSQL SQL Tricks wiki
> > ( http://postgres.cz/wiki/PostgreSQL_SQL_Tricks#Function_for_decoding_of_url_code )
> >
> > The idea is to handle each encoded/non-encoded part in bulk rather than splitting on each character.
> >
> > urldecode_arr:
> > Seq Scan on lt_referrer  (actual time=1.966..17623.979 rows=65717 loops=1)
> >
> > urldecode:
> > Seq Scan on lt_referrer  (actual time=4.846..144445.292 rows=65717 loops=1)
> 
> very nice.  Basically it comes down to this: all non-trivial regex
> replacements require decomposition of the string into an array because
> regexp_replace() is unable to do any kind of transformation on the
> string.  This is a crippling limitation relative to first-class regex
> languages like perl; postgres string translation functions are
> invisible to the regex engine.  I have no idea if this is fixable (I
> dimly recall Tom explaining why it might not be).
> 
> merlin
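
For comparison, here is a minimal Python sketch of the kind of callback-based
replacement that a first-class regex language allows. It is only an
illustration of the idea, not the PL/pgSQL function discussed in this thread:
the names ENCODED_RUN and urldecode_bulk are made up for the example, the
input is assumed to be percent-encoded UTF-8, and '+' is deliberately left
untouched.

```python
import re

# One match covers a whole run of consecutive %XX escapes, so a multi-byte
# UTF-8 sequence such as %C3%A9 is decoded in a single step.
ENCODED_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def urldecode_bulk(s):
    """Decode percent-encoded runs in bulk (assumes UTF-8 payloads)."""

    def decode_run(m):
        # "%C3%A9" -> "C3A9" -> b"\xc3\xa9" -> "é"
        hex_digits = m.group(0).replace("%", "")
        return bytes.fromhex(hex_digits).decode("utf-8", errors="replace")

    # The callback transforms each match; this is the step that
    # regexp_replace() cannot express.
    return ENCODED_RUN.sub(decode_run, s)


if __name__ == "__main__":
    print(urldecode_bulk("caf%C3%A9%20au%20lait"))
```

The non-encoded stretches are never touched character by character; only the
encoded runs go through the callback.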

Yes, one possible aid for such problems would be a new variant of regexp_split_to_table
that returns two columns:
- the split parts (as it does currently)
- the separator matches (new)
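
To make the two-column idea concrete, here is a rough Python sketch of what
such a split could return. No such variant exists in PostgreSQL; the helper
names split_with_separators and urldecode_via_split are hypothetical, and
the decoding step assumes percent-encoded UTF-8.

```python
import re


def split_with_separators(s, pattern):
    """Yield (part, separator) pairs; separator is None after the last part."""
    pos = 0
    for m in re.finditer(pattern, s):
        yield s[pos:m.start()], m.group(0)
        pos = m.end()
    yield s[pos:], None


def urldecode_via_split(s):
    """Use the separator column to decode each encoded run in bulk."""
    out = []
    for part, sep in split_with_separators(s, r"(?:%[0-9A-Fa-f]{2})+"):
        out.append(part)  # non-encoded text passes through unchanged
        if sep is not None:
            raw = bytes.fromhex(sep.replace("%", ""))
            out.append(raw.decode("utf-8", errors="replace"))
    return "".join(out)


if __name__ == "__main__":
    for row in split_with_separators("a%20b%C3%A9c", r"(?:%[0-9A-Fa-f]{2})+"):
        print(row)
    print(urldecode_via_split("a%20b%C3%A9c"))
```

With both columns available, the transformation can be applied to the
separator matches and the result stitched back together in order, without
decomposing the string character by character.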

Marc


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)