I have a message posted in pgsql-general that outlines what I thought was an indexing problem - it's not, so I'm bringing it here. I dumped the table from our production system and stuffed it into a test machine, then started refining and playing with the query until I was able to get it to the "de-minimus" that showed the issue. Note that the actual query is frequently MUCH more complicated, but without the LIMIT shown below the planner seems to do a decent job of figuring out how to "get it done." The actual table in question has ~2m rows totaling several gigabytes of space. Here's an abstract of the schema: Table "public.post" Column | Type | Modifiers -----------+--------------------------+-------------------------------------------------------- subject | text | message | text | inserted | timestamp with time zone | modified | timestamp with time zone | replied | timestamp with time zone | ordinal | integer | not null default nextval('post_ordinal_seq'::regclass) Indexes: "post_pkey" PRIMARY KEY, btree (ordinal) "idx_message" gin (to_tsvector('english'::text, message)) "idx_subject" gin (to_tsvector('english'::text, subject)) There's a bunch of other stuff in the table and many more indices, plus foreign references, but stripping the table down to JUST THIS shows the problem. ticker=# explain analyze select * from post where to_tsvector('english', message) @@ to_tsquery('violence') order by modified desc; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Sort (cost=31795.16..31819.68 rows=9808 width=436) (actual time=14.222..17.213 rows=3421 loops=1) Sort Key: modified Sort Method: quicksort Memory: 3358kB -> Bitmap Heap Scan on post (cost=1418.95..31144.90 rows=9808 width=436) (actual time=1.878..7.514 rows=3421 loops=1) Recheck Cond: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text)) -> Bitmap Index Scan on idx_message (cost=0.00..1416.49 rows=9808 width=0) (actual time=1.334..1.334 rows=3434 loops=1) Index Cond: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text)) Total runtime: 20.547 ms (8 rows) Ok, very nice. 20ms. I like that. Now lets limit the return to 100 items: ticker=# explain analyze select * from post where to_tsvector('english', message) @@ to_tsquery('violence') order by modified desc limit 100; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..5348.69 rows=100 width=436) (actual time=198.047..2607.077 rows=100 loops=1) -> Index Scan Backward using post_modified on post (cost=0.00..524599.31 rows=9808 width=436) (actual time=198.043..2606.864 rows=100 loops=1) Filter: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text)) Total runtime: 2607.231 ms (4 rows) Bad. Notice that the optimizer decided it was going to do an index scan with an internal filter on it! That's BACKWARD; what I want is for the planner to first execute the index scan on the GIN index, then order the return and limit the returned data set. But it gets much worse - let's use something that's NOT in the message base (the table in question has some ~2m rows by the way and consumes several gigabytes on disk - anything that actually READS the table is instant "bad news!") ticker=# explain analyze select * from post where to_tsvector('english', message) @@ to_tsquery('hosehead') order by modified; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Sort (cost=31795.16..31819.68 rows=9808 width=436) (actual time=0.407..0.407 rows=0 loops=1) Sort Key: modified Sort Method: quicksort Memory: 25kB -> Bitmap Heap Scan on post (cost=1418.95..31144.90 rows=9808 width=436) (actual time=0.402..0.402 rows=0 loops=1) Recheck Cond: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) -> Bitmap Index Scan on idx_message (cost=0.00..1416.49 rows=9808 width=0) (actual time=0.399..0.399 rows=0 loops=1) Index Cond: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) Total runtime: 0.441 ms (8 rows) Very fast, as you'd expect - it returned nothing. Now let's try it with a "LIMIT": ticker=# explain analyze select * from post where to_tsvector('english', message) @@ to_tsquery('hosehead') order by modified limit 100; NOTICE: word is too long to be indexed DETAIL: Words longer than 2047 characters are ignored. QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..5348.69 rows=100 width=436) (actual time=254217.850..254217.850 rows=0 loops=1) -> Index Scan using post_modified on post (cost=0.00..524599.31 rows=9808 width=436) (actual time=254217.847..254217.847 rows=0 loops=1) Filter: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) Total runtime: 254217.891 ms (4 rows) ticker=# Oh crap. It actually went through and looked at the entire freaking table - one message at a time. An attempt to re-write the query into something that FORCES the planner to do the right thing fails too. For example: ticker=# explain analyze select * from post where ordinal in (select ordinal from post where to_tsvector('english', message) @@ to_tsquery('hosehead')) order by modified; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=94886.44..94910.96 rows=9808 width=436) (actual time=0.884..0.884 rows=0 loops=1) Sort Key: public.post.modified Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=31173.42..94236.19 rows=9808 width=436) (actual time=0.879..0.879 rows=0 loops=1) -> HashAggregate (cost=31173.42..31271.50 rows=9808 width=4) (actual time=0.877..0.877 rows=0 loops=1) -> Bitmap Heap Scan on post (cost=1422.95..31148.90 rows=9808 width=4) (actual time=0.850..0.850 rows=0 loops=1) Recheck Cond: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) -> Bitmap Index Scan on idx_message (cost=0.00..1420.50 rows=9808 width=0) (actual time=0.848..0.848 rows=0 loops=1) Index Cond: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) -> Index Scan using post_ordinal on post (cost=0.00..6.41 rows=1 width=436) (never executed) Index Cond: (public.post.ordinal = public.post.ordinal) Total runtime: 0.985 ms (12 rows) Fast, if convoluted. ticker=# explain analyze select * from post where ordinal in (select ordinal from post where to_tsvector('english', message) @@ to_tsquery('hosehead')) order by modified limit 100; NOTICE: word is too long to be indexed DETAIL: Words longer than 2047 characters are ignored. QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=0.00..19892.88 rows=100 width=436) (actual time=270563.091..270563.091 rows=0 loops=1) -> Nested Loop Semi Join (cost=0.00..1951093.77 rows=9808 width=436) (actual time=270563.088..270563.088 rows=0 loops=1) -> Index Scan using post_modified on post (cost=0.00..509887.63 rows=1961557 width=436) (actual time=0.015..3427.627 rows=1953674 loops=1) -> Index Scan using post_ordinal on post (cost=0.00..0.73 rows=1 width=4) (actual time=0.134..0.134 rows=0 loops=1953674) Index Cond: (public.post.ordinal = public.post.ordinal) Filter: (to_tsvector('english'::text, public.post.message) @@ to_tsquery('hosehead'::text)) Total runtime: 270563.147 ms (7 rows) ticker=# Ok, that didn't work either. Interestingly enough, if I crank up the limit to 500, it starts behaving! ticker=# explain analyze select * from post where ordinal in (select ordinal from post where to_tsvector('english', message) @@ to_tsquery('hosehead')) order by modified limit 500; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=94724.91..94726.16 rows=500 width=436) (actual time=1.475..1.475 rows=0 loops=1) -> Sort (cost=94724.91..94749.43 rows=9808 width=436) (actual time=1.473..1.473 rows=0 loops=1) Sort Key: public.post.modified Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=31173.43..94236.19 rows=9808 width=436) (actual time=1.468..1.468 rows=0 loops=1) -> HashAggregate (cost=31173.43..31271.51 rows=9808 width=4) (actual time=1.466..1.466 rows=0 loops=1) -> Bitmap Heap Scan on post (cost=1422.95..31148.91 rows=9808 width=4) (actual time=1.440..1.440 rows=0 loops=1) Recheck Cond: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) -> Bitmap Index Scan on idx_message (cost=0.00..1420.50 rows=9808 width=0) (actual time=1.438..1.438 rows=0 loops=1) Index Cond: (to_tsvector('english'::text, message) @@ to_tsquery('hosehead'::text)) -> Index Scan using post_ordinal on post (cost=0.00..6.41 rows=1 width=436) (never executed) Index Cond: (public.post.ordinal = public.post.ordinal) Total runtime: 1.600 ms (13 rows) Why is the planner "taking into consideration" the LIMIT (I know the docs say it does) and choosing to sequentially scan a table of nearly 2 million rows?! I don't see how that makes sense.... irrespective of the query being LIMITed. If it matters setting enable_seqscan OFF does not impact the results. -- Karl
begin:vcard fn:Karl Denninger n:Denninger;Karl email;internet:karl@xxxxxxxxxxxxx x-mozilla-html:TRUE version:2.1 end:vcard
-- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance