On Mon, Feb 1, 2010 at 7:53 PM, Matt White <mattw922@xxxxxxxxx> wrote: > I have a relatively straightforward query that by itself isn't that > slow, but we have to run it up to 40 times on one webpage load, so it > needs to run much faster than it does. Here it is: > > SELECT COUNT(*) FROM users, user_groups > WHERE users.user_group_id = user_groups.id AND NOT users.deleted AND > user_groups.partner_id IN > (partner_id_1, partner_id_2); > > The structure is partners have user groups which have users. In the > test data there are over 200,000 user groups and users but only ~3000 > partners. Anyone have any bright ideas on how to speed this query up? > > Here's the query plan: > > Aggregate (cost=12574.53..12574.54 rows=1 width=0) (actual > time=2909.298..2909.299 rows=1 loops=1) > -> Hash Join (cost=217.79..12566.08 rows=3378 width=0) (actual > time=2909.284..2909.284 rows=0 loops=1) > Hash Cond: (users.user_group_id = user_groups.id) > -> Seq Scan on users (cost=0.00..11026.11 rows=206144 > width=4) (actual time=0.054..517.811 rows=205350 loops=1) > Filter: (NOT deleted) > -> Hash (cost=175.97..175.97 rows=3346 width=4) (actual > time=655.054..655.054 rows=200002 loops=1) > -> Nested Loop (cost=0.27..175.97 rows=3346 width=4) > (actual time=1.327..428.406 rows=200002 loops=1) > -> HashAggregate (cost=0.27..0.28 rows=1 > width=4) (actual time=1.259..1.264 rows=2 loops=1) > -> Result (cost=0.00..0.26 rows=1 > width=0) (actual time=1.181..1.240 rows=2 loops=1) > -> Index Scan using user_groups_partner_id_idx > on user_groups (cost=0.00..133.86 rows=3346 width=8) (actual > time=0.049..96.992 rows=100001 loops=2) > Index Cond: (user_groups.partner_id = > (partner_all_subpartners(3494))) > > > The one obvious thing that everyone will point out is the sequential > scan on users, but there actually is an index on users.deleted. When I > forced sequential scanning off, it ran slower, so the planner wins > again. Yeah, I don't think the sequential scan is hurting you. What is bugging me is that it doesn't look like the plan you've posted is for the query you've posted. The plan shows an index condition that references partner_all_subpartners(3494), which doesn't appear in your original query, and also has two aggregates in it, where your posted query only has one. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance