On Wed, Nov 30, 2016 at 11:05 AM, George <pinkisntwell@xxxxxxxxx> wrote:
> On Wed, Nov 30, 2016 at 6:45 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>> On Wed, Nov 30, 2016 at 10:42 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>>> George <pinkisntwell@xxxxxxxxx> writes:
>>>> explain select * from wg3ppbm_transaction where partner_uuid in (
>>>>     select p.uuid
>>>>     from wg3ppbm_userpartner up
>>>>     join wg3ppbm_partner p on p.id = up.partner_id
>>>> );
>>>
>>>> "Hash Semi Join  (cost=2.07..425.72 rows=2960 width=482)"
>>>> "  Hash Cond: ((wg3ppbm_transaction.partner_uuid)::text = (p.uuid)::text)"
>>>> "  ->  Seq Scan on wg3ppbm_transaction  (cost=0.00..375.19 rows=5919 width=482)"
>>>> "  ->  Hash  (cost=2.06..2.06 rows=1 width=37)"
>>>> "        ->  Nested Loop  (cost=0.00..2.06 rows=1 width=37)"
>>>> "              Join Filter: (up.partner_id = p.id)"
>>>> "              ->  Seq Scan on wg3ppbm_userpartner up  (cost=0.00..1.01 rows=1 width=4)"
>>>> "              ->  Seq Scan on wg3ppbm_partner p  (cost=0.00..1.02 rows=2 width=41)"
>>>
>>> This plan is expecting to have to return about half of the rows in
>>> wg3ppbm_transaction, a situation for which an indexscan would NOT
>>> be a better choice.  The usual rule of thumb is that you need to be
>>> retrieving at most one or two percent of a table's rows for an indexscan
>>> on it to be faster than a seqscan.
>>>
>>> I think however that the "half" may be a default estimate occasioned
>>> by the other tables being empty and therefore not having any statistics.
>>> Another rule of thumb is that the plans you get for tiny tables have
>>> little to do with what happens once there's lots of data.
>>
>> Yeah, don't make query plan assumptions against empty or nearly empty
>> tables.  As the data grows, the plans will suitably change.  Perhaps
>> OP just recently loaded a bunch of data and the tables haven't been
>> analyzed yet?
>
> I just added a significant number of rows to the table. I now have
> 1.3M rows in total but only 8K rows that contain the value I am
> seeking.
> I also ran ANALYZE after loading the data. The query plans
> for the two queries did not change. Also, the simple query returns in
> 45 ms while the one with the subquery needs 1.5 s, i.e. it is about
> 30x slower.
>
> So there is definitely something wrong here. This situation makes many
> row-level security use cases cumbersome since you need to have
> almost the same WHERE clause both in the row-level security policy and
> in every SELECT query in order for the index to be used.

Can you give EXPLAIN ANALYZE for the 'good' query and the 'bad' query?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
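[Editor's note: the comparison being requested can be sketched as below. This assumes the "simple" query filters directly on partner_uuid; the literal uuid value is hypothetical, and the table names are taken from the plan quoted above.]

```sql
-- Refresh planner statistics after the bulk load, so row estimates
-- reflect the 1.3M-row table rather than the old empty-table defaults:
ANALYZE wg3ppbm_transaction;
ANALYZE wg3ppbm_partner;
ANALYZE wg3ppbm_userpartner;

-- The "good" query: a direct filter, which the planner can satisfy
-- with an index scan when it matches only ~8K of 1.3M rows:
EXPLAIN ANALYZE
SELECT * FROM wg3ppbm_transaction
WHERE partner_uuid = '00000000-0000-0000-0000-000000000000';

-- The "bad" query: the same filter expressed as an IN-subquery,
-- which the plan above executes as a Hash Semi Join over a seqscan:
EXPLAIN ANALYZE
SELECT * FROM wg3ppbm_transaction
WHERE partner_uuid IN (
    SELECT p.uuid
    FROM wg3ppbm_userpartner up
    JOIN wg3ppbm_partner p ON p.id = up.partner_id
);
```

Comparing the `rows` estimates against the `actual rows` in the two EXPLAIN ANALYZE outputs shows whether the planner's selectivity estimate for the semi-join is what keeps it on the seqscan plan.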