Hi
I am seeking some clarification in regard to full text search
across multiple tables and what the best approach is.
Documentation talks about two approaches when it comes to building
a document: on-the-fly concat of columns and a dedicated tsv
column approach. Let's say I want to perform a search for "txt1
& txt2 & txt3" on columns table1.col1, table1.col2 and
table2.col1. I see the following solutions:
1. Concat all three into a document and perform a FTS.
SELECT * FROM (
SELECT to_tsvector(table1.col1) ||
to_tsvector(table1.col2) ||
to_tsvector(table2.col1) as document F
ROM table1 LEFT JOIN table2 ON table1.table2_id=table2.id
) subquery
WHERE subquery.document @@ to_tsquery(unaccent(?));
2. Create a tsv column in each table, concat tsv columns
and perform FTS on that.
SELECT * FROM table1 LEFT JOIN table2 ON table1.table2_id=table2.id WHERE table1.tsv || tale2.tsv @@ to_tsquery(unaccent(?));
3. Have a tsv column only in table1 and insert table2.col1
to the tsv via triggers. Works but seems very hacky.
It seems to me that option #2 is fast and easy to implement
but I am not sure what the concat of tsvs really means from
index usage and performance standpoint. Option #1 is the most
flexible and I'd use that all the time if it was not THAT much
slower than tsv column approacj. Documentation on TSV columns
states: "
Another advantage is that searches will be
faster, since it will not be necessary to redo the to_tsvector
calls to verify index matches."
The question is, how much faster are tsv columns really? Are
there any benchmarks about this? If the performance difference is
negligible I'd advocate that using tsv columns is a waste of time
and space in most general cases. But since there is no information
on how much faster it's hard to decide.
Best regards,
Klemen