thanks for the pointers .... unfortunately its not just attribute names.
Here is what I am thinking of doing;
In a first step I run a query
SELECT id FROM time_series WHERE
to_tsvector(xml_string)
@@
to_tsquery( anystring );
to_tsvector(xml_string)
@@
to_tsquery( anystring );
then I load the actual xml string into memory for each id found and use xpath to search the document in memory. This will at least use my text index on the first hit.
Thanks
Chris
On Tue, Mar 16, 2010 at 4:16 PM, Richard Huxton <dev@xxxxxxxxxxxx> wrote:
On 16/03/10 13:49, Richard Huxton wrote:Actually, if it's only attribute names you're interested in you could do it with xpath
You could run an xslt transform over the xml fragments and extract what
you want and then use tsearch to index that, I suppose. Similarly, you
might be able to do the same via xslt and xquery.
Something like (untested):
ALTER TABLE time_series ADD attr_names text;
UPDATE time_series SET attr_names = array_to_string(
xpath('*/Attribute/Name/text()', external_attributes)
,' '
);
CREATE INDEX fti_attr_names ON time_series USING gin(
to_tsvector('simple', attr_names)
);
SELECT * FROM time_series WHERE
to_tsvector('simple', attr_names)
@@
to_tsquery('simple', 'attribute22');
I'd probably just store the tsvector rather than text unless the text is of some use in itself.
If you plan to do anything with the attributes it'd still be better to split them out into their own table though.
--
Richard Huxton
Archonet Ltd
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general