On Thu, Nov 14, 2019 at 5:20 PM Craig James <cjames@xxxxxxxxxxxxxx> wrote:
> I'm completely baffled by this problem: I'm doing a delete that joins three modest-sized tables, and it gets completely stuck: 100% CPU use forever. Here's the query:
>
>  Aggregate  (cost=193.54..193.55 rows=1 width=8)
>    ->  Nested Loop Semi Join  (cost=0.84..193.54 rows=1 width=0)
>          Join Filter: (categories.id = c.id)
>          ->  Index Scan using i_categories_category_id on categories  (cost=0.42..2.44 rows=1 width=4)
>                Index Cond: (category_id = 23)
>          ->  Nested Loop Anti Join  (cost=0.42..191.09 rows=1 width=4)
>                Join Filter: (c.id = st.id)
>                ->  Index Scan using i_categories_category_id on categories c  (cost=0.42..2.44 rows=1 width=4)
>                      Index Cond: (category_id = 23)
>                ->  Seq Scan on category_staging_23 st  (cost=0.00..99.40 rows=7140 width=4)

If the estimates were correct, this shouldn't be slow. But how can it screw up the estimate by this much, when the conditions are so simple? How many rows are there actually in categories where category_id=23?
What do you see in `select * from pg_stats where tablename='categories' and attname='category_id' \x\g\x`?
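Concretely, something along these lines would answer both questions (table and column names taken from the plan above):

    -- How many rows actually satisfy the condition the planner costed at rows=1?
    SELECT count(*) FROM categories WHERE category_id = 23;

    -- The planner's view of that column: distinct-value estimate and the
    -- most common values/frequencies it has recorded.
    SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'categories' AND attname = 'category_id';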
Since it thinks the seq scan of category_staging_23 is only going to happen once (at the bottom of two nested loops, but each executing just once), it sees no benefit in hashing that table. Of course it is actually happening a lot more than once.
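Something like the following would show that mismatch directly (the DELETE itself isn't quoted above, so this is only the shape of the check): EXPLAIN (ANALYZE, BUFFERS) reports actual row counts and a loops= figure for each node, and since it really executes the DELETE, wrap it in a transaction you roll back.

    BEGIN;
    -- Caution: EXPLAIN ANALYZE runs the statement for real.
    EXPLAIN (ANALYZE, BUFFERS) DELETE FROM ...;  -- the actual DELETE goes here
    ROLLBACK;

    -- If the outer index scan really returns thousands of rows rather than 1,
    -- the Seq Scan on category_staging_23 will show loops in the thousands.

    -- Refreshing statistics is the usual first step when a rows=1 estimate is
    -- this far off:
    ANALYZE categories;
    ANALYZE category_staging_23;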
Cheers,
Jeff