Hi, In my DB, there is a large table containing messages and one that contains message boxes. Messages are assigned to boxes via a child parent link m->b. In order to obtain the last message for a specific box, I use the following SQL: SELECT m1.id FROM message m1 LEFT JOIN message m2 ON (m1.box_id = m2.box_id AND m1.id < m2.id) WHERE m2.id IS NULL AND m1.box_id = id; This worked quite well for a long time. But now, suddenly the performance of this query drastically degraded as new messages were added. If these new messages are removed again, everything is back to normal. If other messages are removed instead, the problem remains, so it does not seem to be a memory issue. I fear I have difficulties to understand what is going wrong here. This is the query plan when everything is fine: "Seq Scan on public.box this_ (cost=0.00..10467236.32 rows=128 width=696) (actual time=0.169..7683.978 rows=128 loops=1)" " Output: this_.id, this_.login, (SubPlan 1)" " Buffers: shared hit=188413 read=94635 written=135, temp read=22530 written=22374" " SubPlan 1" " -> Hash Anti Join (cost=41323.25..81775.25 rows=20427 width=8) (actual time=59.571..59.877 rows=1 loops=128)" " Output: m1.id" " Hash Cond: (m1.box_id = m2.box_id)" " Join Filter: (m1.id < m2.id)" " Buffers: shared hit=188412 read=94633 written=135, temp read=22530 written=22374" " -> Bitmap Heap Scan on public.message m1 (cost=577.97..40212.28 rows=30640 width=16) (actual time=3.152..9.514 rows=17982 loops=128)" " Output: m1.id, m1.box_id" " Recheck Cond: (m1.box_id = $0)" " Buffers: shared hit=131993 read=9550 written=23" " -> Bitmap Index Scan on "message_box_Idx" (cost=0.00..570.31 rows=30640 width=0) (actual time=2.840..2.840 rows=18193 loops=128)" " Index Cond: (m1.box_id = $0)" " Buffers: shared hit=314 read=6433 written=23" " -> Hash (cost=40212.28..40212.28 rows=30640 width=16) (actual time=26.840..26.840 rows=20014 loops=115)" " Output: m2.box_id, m2.id" " Buckets: 4096 Batches: 4 (originally 2) Memory Usage: 5444kB" " Buffers: shared hit=56419 read=85083 written=112, temp written=7767" " -> Bitmap Heap Scan on public.message m2 (cost=577.97..40212.28 rows=30640 width=16) (actual time=2.419..20.007 rows=20014 loops=115)" " Output: m2.box_id, m2.id" " Recheck Cond: (m2.box_id = $0)" " Buffers: shared hit=56419 read=85083 written=112" " -> Bitmap Index Scan on "message_box_Idx" (cost=0.00..570.31 rows=30640 width=0) (actual time=2.166..2.166 rows=20249 loops=115)" " Index Cond: (m2.box_id = $0)" " Buffers: shared hit=6708" "Total runtime: 7685.202 ms" This is the plan when the query gets sluggish: "Seq Scan on public.box this_ (cost=0.00..10467236.32 rows=128 width=696) (actual time=0.262..179333.086 rows=128 loops=1)" " Output: this_.id, this_.login, (SubPlan 1)" " Buffers: shared hit=189065 read=93983 written=10, temp read=22668 written=22512" " SubPlan 1" " -> Hash Anti Join (cost=41323.25..81775.25 rows=20427 width=8) (actual time=1264.700..1400.886 rows=1 loops=128)" " Output: m1.id" " Hash Cond: (m1.box_id = m2.box_id)" " Join Filter: (m1.id < m2.id)" " Buffers: shared hit=189064 read=93981 written=10, temp read=22668 written=22512" " -> Bitmap Heap Scan on public.message m1 (cost=577.97..40212.28 rows=30640 width=16) (actual time=3.109..9.850 rows=18060 loops=128)" " Output: m1.id, m1.box_id" " Recheck Cond: (m1.box_id = $0)" " Buffers: shared hit=132095 read=9448" " -> Bitmap Index Scan on "message_box_Idx" (cost=0.00..570.31 rows=30640 width=0) (actual time=2.867..2.867 rows=18193 loops=128)" " Index Cond: (m1.box_id = $0)" " Buffers: shared hit=312 read=6435" " -> Hash (cost=40212.28..40212.28 rows=30640 width=16) (actual time=27.533..27.533 rows=20102 loops=115)" " Output: m2.box_id, m2.id" " Buckets: 4096 Batches: 4 (originally 2) Memory Usage: 5522kB" " Buffers: shared hit=56969 read=84533 written=10, temp written=7811" " -> Bitmap Heap Scan on public.message m2 (cost=577.97..40212.28 rows=30640 width=16) (actual time=2.406..20.492 rows=20102 loops=115)" " Output: m2.box_id, m2.id" " Recheck Cond: (m2.box_id = $0)" " Buffers: shared hit=56969 read=84533 written=10" " -> Bitmap Index Scan on "message_box_Idx" (cost=0.00..570.31 rows=30640 width=0) (actual time=2.170..2.170 rows=20249 loops=115)" " Index Cond: (m2.box_id = $0)" " Buffers: shared hit=6708" "Total runtime: 179334.310 ms" So from my limited experience, the only significant difference I see is that the Hash Anti Join takes a lot more time in plan 2, but I do not understand why. Ideas somebody? Thanks panam -- View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4420974.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance