On 22 April 2014 08:40, Heikki Linnakangas <hlinnakangas@xxxxxxxxxx> wrote: > On 04/20/2014 02:15 AM, Ivan Voras wrote: >> More details: after thinking about it some more, it might have >> something to do with tsearch2 and indexes: the large data in this case >> is a tsvector, indexed with GIN, and the query plan involves a >> re-check condition. >> >> The query is of the form: >> SELECT simple_fields FROM table WHERE fts @@ to_tsquery('...'). >> >> Does the "re-check condition" mean that the original tsvector data is >> always read from the table in addition to the index? > > > Yes, if the re-check condition involves the fts column. I don't see why you > would have a re-check condition with a query like that, though. Are there > some other WHERE-conditions that you didn't show us? Yes, I've read about tsearch2 and GIN indexes and there shouldn't be a recheck condition - but there is. This is the query: SELECT documents.id, title, raw_data, q, ts_rank(fts_data, q, 4) AS rank, html_filename FROM documents, to_tsquery('document') AS q WHERE fts_data @@ q ORDER BY rank DESC LIMIT 25; And here is the explain analyze: http://explain.depesz.com/s/4xm It clearly shows a bitmap index scan operation is immediately followed by a recheck operation AND that the recheck operation actually does something, because it reduces the number of records from 61 to 58 (!!!). This is the table structure: nn=# \d documents Table "public.documents" Column | Type | Modifiers ---------------+----------+-------------------------------------------------------- id | integer | not null default nextval('documents_id_seq'::regclass) ctime | integer | not null default unix_ts(now()) dtime | integer | not null title | text | not null html_filename | text | not null raw_data | text | fts_data | tsvector | not null tags | text[] | dtype | integer | not null default 0 flags | integer | not null default 0 Indexes: "documents_pkey" PRIMARY KEY, btree (id) "documents_html_filename" UNIQUE, btree (html_filename) "documents_dtime" btree (dtime) "documents_fts_data" gin (fts_data) "documents_tags" gin (tags) > The large fields are stored in the toast table. You can check if the toast > table is accessed with a query like this: > > select * from pg_stat_all_tables where relid = (select reltoastrelid from > pg_class where relname='table'); > > Run that before and after your query, and see if the numbers change. Before: relid|schemaname|relname|seq_scan|seq_tup_read|idx_scan|idx_tup_fetch|n_tup_ins|n_tup_upd|n_tup_del|n_tup_hot_upd|n_live_tup|n_dead_tup|last_vacuum|last_autovacuum|last_analyze|last_autoanalyze|vacuum_count|autovacuum_count|analyze_count|autoanalyze_count 27290|pg_toast|pg_toast_27283|3|0|2481289|10631453|993194|0|266306|0|147931|2514||2014-04-18 00:49:11.066443+02|||0|11|0|0 After: relid|schemaname|relname|seq_scan|seq_tup_read|idx_scan|idx_tup_fetch|n_tup_ins|n_tup_upd|n_tup_del|n_tup_hot_upd|n_live_tup|n_dead_tup|last_vacuum|last_autovacuum|last_analyze|last_autoanalyze|vacuum_count|autovacuum_count|analyze_count|autoanalyze_count 27290|pg_toast|pg_toast_27283|3|0|2481347|10632814|993194|0|266306|0|147931|2514||2014-04-18 00:49:11.066443+02|||0|11|0|0 idx_scan has changed from 2481289 to 2481347 (58) idx_tup_fetch has changed from 10631453 to 10632814 (1361) Number 58 corresponds to the number of rows found by the index, seen in the EXPLAIN output, I don't know where 1361 comes from. I'm also surprised by the amount of memory used for sorting (23 kB), since the actually returned data from my query (all the tuples from all the 58 rows) amount to around 2 kB - but this is not an actual problem. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance