On Thu, 23 Oct 2008 00:10:19 +0200 Mikkel Høgh <m@xxxxxx> wrote: > Hi, > I'm trying to make a module allowing Drupal to take advantage of > PostgreSQL's excellent Full Text Search, aka. tsearch. > Since this module will probably not become part of Drupal core > right off the bat, I need to do this without modifying Drupal's > own tables, so I've created a new one for the purpose, > "tsearch_node". > > I'm pondering how best to do this. Basic minimum is a column which > holds the foreign key (nid) to Drupal's node. > > I'm wondering whether to merge all the searchable stuff when > creating the index or to have a separate tsvector column for each > of the three commonly indexable things about Drupal content > (nodes), namely the title, the body and taxonomy. Any insights? weight them and you'll be able to search by field and "globally". I didn't make any scientific test but I previously had something like: create table subtable ( subtableid int, body text, ftidx tsvector ) create table maintable ( maintableid int, body text, subtableid int ftidx tsvector ); but it seem that just searching on a tsvector in maintable build up with setweight(to_tsvector('pg_catalog.english', coalesce(maintable.body,'')), 'A') || ' ' || setweight(to_tsvector('pg_catalog.english', coalesce(subtable.body,'')), 'B') is faster. Beware of the difference between gist and gin indexes for "restricted" weighted searches since with the latter you've to use @@@ > Also, I figure it'd be a good idea to keep the language of the > indexed content right there in the same table, since Drupals own > node table specifies two-letter codes (en, da, pt, etc.) and as > far as I can understand the PostgreSQL documentation, tsearch > expects the full language name (english, danish, portuguese), so > to use it in queries without having to do too much magic would > require me to store the full name right there in the table, or am > I mistaken? You'll have to build up a tsvector for each language, so yeah it may be useful to store the tsvector together with the language with witch it was obtained. If you don't tsearch won't work. eg. "cane" with Italian tsearch becomes {can}, but when you search it with English tsearch it remains {cane} so you won't find it. -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general