> jesper@xxxxxxxx wrote: >> >> So getting them with AND inbetween gives x100 better performance. All >> queries are run on "hot disk" repeated 3-5 times and the number are from >> the last run, so disk-read effects should be filtered away. >> >> Shouldn't it somehow just do what it allready are capable of doing? > > I'm guessing to_tsquery(...) will produce a tree of search terms (since > it allows for quite complex expressions). Presumably there's a standard > order it gets processed in too, so it should be possible to generate a > more or less efficient ordering. > > That structure isn't exposed to the planner though, so it doesn't > benefit from any re-ordering the planner would normally do for normal > (exposed) AND/OR clauses. > > Now, to_tsquery() can't re-order the search terms because it doesn't > know what column it's being compared against. In fact, it might not be a > simple column at all. I cant follow this logic based on explain output, but I may have misunderstood something. The only difference in these two query-plans is that we have an additional or'd term in the to_tsquery(). What we see is that, the query-planner indeed has knowledge about changes in the row estimates based on changes in the query to to_tsquery(). My guess is that it is because to_tsquery actually parses the query and give the estimates, now how can to_tsquery give estimates without having access to the statistics for the column? ftstest=# explain select id from ftsbody where ftsbody_body_fts @@ to_tsquery('reallyrare'); QUERY PLAN --------------------------------------------------------------------------------- Bitmap Heap Scan on ftsbody (cost=132.64..190.91 rows=29 width=4) Recheck Cond: (ftsbody_body_fts @@ to_tsquery('reallyrare'::text)) -> Bitmap Index Scan on ftsbody_tfs_idx (cost=0.00..132.63 rows=29 width=0) Index Cond: (ftsbody_body_fts @@ to_tsquery('reallyrare'::text)) (4 rows) ftstest=# explain select id from ftsbody where ftsbody_body_fts @@ to_tsquery('reallyrare | morerare'); QUERY PLAN ------------------------------------------------------------------------------------- Bitmap Heap Scan on ftsbody (cost=164.86..279.26 rows=57 width=4) Recheck Cond: (ftsbody_body_fts @@ to_tsquery('reallyrare | morerare'::text)) -> Bitmap Index Scan on ftsbody_tfs_idx (cost=0.00..164.84 rows=57 width=0) Index Cond: (ftsbody_body_fts @@ to_tsquery('reallyrare | morerare'::text)) (4 rows) ftstest=# explain select id from ftsbody where ftsbody_body_fts @@ to_tsquery('reallyrare | reallycommon'); QUERY PLAN -------------------------------------------------------------------------- Seq Scan on ftsbody (cost=0.00..1023249.39 rows=5509293 width=4) Filter: (ftsbody_body_fts @@ to_tsquery('reallyrare | reallycommon'::text)) (2 rows) > 2. A variant to_tsquery_with_sorting() which would take the column-name > or something and look up the stats to work against. Does above not seem like its there allready? (sorry.. looking at C-code from my point of view would set me a couple of weeks back, so I have troble getting closer to the answer than interpreting the output and guessing the rest). -- Jesper -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance