Search Postgresql Archives

Re: regexp_matches problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Who many thanks Laurenz.
You gave me the right direction.
The following is what I made up and it does exactly what I want it to do:
	--collect the specific string data for each result row 
SELECT (
	SELECT match[1] FROM
regexp_matches(secondresult.imgstring,'title="([^"]+)"','ig') AS match
	) AS titletekst
	,(
	SELECT match[1] FROM
regexp_matches(secondresult.imgstring,'alt="([^"]+)"','ig') AS match
	) AS alttekst
	,(
	SELECT match[1] FROM
regexp_matches(secondresult.imgstring,'src="([^"]+)"','ig') AS match
	) AS imgsrc
FROM (
--collect the image string of every found image
	SELECT firstresult[1] AS imgstring
	FROM regexp_matches('<img src="wwww" title="titletext "
class="class12">'
		,'<img[[:space:]]+([^>]*)>'
		,'ig'
		) AS firstresult
) as secondresult


This is it. Very simple and very efficient.

It took me ages to find out. Hopefully it helps other people.

Willem

-----Oorspronkelijk bericht-----
Van: pgsql-general-owner@xxxxxxxxxxxxxx
[mailto:pgsql-general-owner@xxxxxxxxxxxxxx] Namens Albe Laurenz
Verzonden: woensdag 13 mei 2009 10:50
Aan: WP Perquin *EXTERN*; pgsql-general@xxxxxxxxxxxxxx
Onderwerp: Re:  regexp_matches problem

WP Perquin wrote:
> When I make the following simplified example:
> 
> SELECT regexp_matches('<img src="wwww" title="dit is een 
> title tekst" class="class12">'
> 
> ,'((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig')
> 
> My result are 2 rows:
> 
> "{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}"
> 
> "{"title=\"dit is een title tekst\"","title=\"dit is een 
> title tekst\"","dit is een title tekst",NULL,NULL,NULL}"
> 
> I would like to have 1 row which contains both the records. 
> Does anyone know how I can solve this?

Do you really want all those NULLs?

Is that what you want:

SELECT match[1]
FROM regexp_matches('<img src="wwww" title="dit is een title tekst"
class="class12">',
                    '(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")',
                    'ig') AS match;

This query returns two rows.

When you write that you want to have one row that contains both
records, do you mean:

a) one string that is the concatenation of both strings
or
b) one row that is a single array with two string elements

Whatever it is you want, you will probably need to write a
little aggregate function that does that for you, something like

CREATE FUNCTION text_cats(state text[], nextv text) RETURNS text[]
   IMMUTABLE STRICT LANGUAGE sql
   AS 'SELECT $1 || $2';

CREATE AGGREGATE text_cat(text) (
   SFUNC = text_cats,
   STYPE = text[],
   INITCOND = '{}' );

for variant b).

Then you can

SELECT text_cat(match[1])
FROM regexp_matches('<img src="wwww" title="dit is een title tekst"
class="class12">',
                    '(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")',
                    'ig') AS match;

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux