Hi, On 25.1.2015 07:38, Joe Van Dyk wrote: > > Here's one that's not quite as well: http://explain.depesz.com/s/SgT As Pavel already pointed out, the first problem is this part of the plan: Seq Scan on events e2 (cost=0.00..120,179.60 rows=4,450,241 width=51) (actual time=0.014..33,773.370 rows=4,450,865 loops=1) Filter: (product_id <> '81716'::citext) Consuming ~33 seconds of the runtime. If you can make this faster somehow (e.g. by getting rid of the citext cast), that'd be nice. Another issue is that the hashjoin is batched: Buckets: 65536 Batches: 8 Memory Usage: 46085kB The hash preparation takes ~40 seconds, so maybe try to give it a bit more memory - I assume you have work_mem=64MB, so try doubling that (ISTM 512MB should work with a single batch). Maybe this won't really improve the performance, though. It still has to process ~4.5M rows. Increasing the work mem could also result in switching to hash aggregate, making the sort (~30 seconds) unnecessary. Anyway, ISTM this works as expected, i.e. (a) with rare product_id values the queries are fast (b) with common product_id values the queries are slow That's expected, because (b) needs to process much more data. I don't think you can magically make it run as fast as (a). The best solution might be to keep a pre-aggregated results - I don't think you really need exact answers when recommending "similar" products. I also wonder if you really need to join the tables? I mean, what if you do something like this: CREATE TABLE events_aggregated AS SELECT site_id, array_agg(product_id) AS product_ids, count(nullif(e2.type='viewed', false)) view_count, count(nullif(e2.type='purchased', false)) purchase_count FROM events GROUP BY 1; and then using intarray with GIN indexes to query this table? Something like this: CREATE products_agg_idx ON aggregated USING GIN (product_ids gin__int_ops); SELECT * FROM events_aggregated WHERE product_ids @> ARRAY['82503']; regards -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance