On Thu, Nov 14, 2019 at 2:29 PM Andres Freund <andres@xxxxxxxxxxx> wrote:
Hi,
On 2019-11-14 14:19:51 -0800, Craig James wrote:
> I'm completely baffled by this problem: I'm doing a delete that joins three
> modest-sized tables, and it gets completely stuck: 100% CPU use forever.
> Here's the query:
I assume this is intended to be an equivalent SELECT? Because you did
mention DELETE, but I'm not seeing one here? Could you actually show
that query - surely that didn't include a count() etc... You can
EPLAIN DELETEs too.
Sorry, my explanation was misleading. It is a "delete ... where id in (select ...)". But I discovered that the select part itself never completes, whether you include it in the delete or not. So I only showed the select, which I converted to a "select count(1) ..." for simplicity.
> explain analyze
> select count(1) from registry.categories
> where category_id = 15 and id in
> (select c.id from registry.categories c
> left join registry.category_staging_15 st on (c.id = st.id) where
> c.category_id = 15 and st.id is null);
>
> If I leave out the "analyze", here's what I get (note that the
> categories_staging_N table's name changes every time; it's
> created on demand as "create table categories_staging_n(id integer)").
> Aggregate (cost=193.54..193.55 rows=1 width=8)
> -> Nested Loop Semi Join (cost=0.84..193.54 rows=1 width=0)
> Join Filter: (categories.id = c.id)
> -> Index Scan using i_categories_category_id on categories
> (cost=0.42..2.44 rows=1 width=4)
> Index Cond: (category_id = 23)
> -> Nested Loop Anti Join (cost=0.42..191.09 rows=1 width=4)
> Join Filter: (c.id = st.id)
> -> Index Scan using i_categories_category_id on categories c
> (cost=0.42..2.44 rows=1 width=4)
> Index Cond: (category_id = 23)
> -> Seq Scan on category_staging_23 st (cost=0.00..99.40
> rows=7140 width=4)
>
> The tables are small. From a debugging printout:
Is categories.category_id unique?
No, categories.category_id is not unique. It has a b-tree index.
Does the plan change if you ANALYZE
the tables?
No. No difference.
But interestingly, it changes as the process goes forward. And it's inconsistent. Here's an example: it's going through several "categories" to update each. The first plan works, and it typically uses this plan a few times. But when selects the second plan, it gets stuck.
----------------
15994 items in table registry.category_staging_15
245598 items in table registry.categories
309398 items in table registry.smiles
15994 items in joined registry.category_staging_15 / registry.categories
0 items to be inserted
inserted: 0E0
EXPLAIN: Aggregate (cost=3464.82..3464.83 rows=1 width=8)
EXPLAIN: -> Hash Semi Join (cost=2029.16..3464.05 rows=311 width=0)
EXPLAIN: Hash Cond: (categories.id = c.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories (cost=0.42..1405.28 rows=7900 width=4)
EXPLAIN: Index Cond: (category_id = 15)
EXPLAIN: -> Hash (cost=1933.44..1933.44 rows=7624 width=4)
EXPLAIN: -> Hash Anti Join (cost=431.28..1933.44 rows=7624 width=4)
EXPLAIN: Hash Cond: (c.id = st.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories c (cost=0.42..1405.28 rows=7900 width=4)
EXPLAIN: Index Cond: (category_id = 15)
EXPLAIN: -> Hash (cost=230.94..230.94 rows=15994 width=4)
EXPLAIN: -> Seq Scan on category_staging_15 st (cost=0.00..230.94 rows=15994 width=4)
0 items deleted
7997 items inserted
----------------
6250 items in table registry.category_staging_25
245598 items in table registry.categories
309398 items in table registry.smiles
6250 items in joined registry.category_staging_25 / registry.categories
6250 items to be inserted
inserted: 3125
EXPLAIN: Aggregate (cost=173.51..173.52 rows=1 width=8)
EXPLAIN: -> Nested Loop Semi Join (cost=0.84..173.51 rows=1 width=0)
EXPLAIN: Join Filter: (categories.id = c.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories (cost=0.42..2.44 rows=1 width=4)
EXPLAIN: Index Cond: (category_id = 25)
EXPLAIN: -> Nested Loop Anti Join (cost=0.42..171.06 rows=1 width=4)
EXPLAIN: Join Filter: (c.id = st.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories c (cost=0.42..2.44 rows=1 width=4)
EXPLAIN: Index Cond: (category_id = 25)
EXPLAIN: -> Seq Scan on category_staging_25 st (cost=0.00..90.50 rows=6250 width=4)
15994 items in table registry.category_staging_15
245598 items in table registry.categories
309398 items in table registry.smiles
15994 items in joined registry.category_staging_15 / registry.categories
0 items to be inserted
inserted: 0E0
EXPLAIN: Aggregate (cost=3464.82..3464.83 rows=1 width=8)
EXPLAIN: -> Hash Semi Join (cost=2029.16..3464.05 rows=311 width=0)
EXPLAIN: Hash Cond: (categories.id = c.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories (cost=0.42..1405.28 rows=7900 width=4)
EXPLAIN: Index Cond: (category_id = 15)
EXPLAIN: -> Hash (cost=1933.44..1933.44 rows=7624 width=4)
EXPLAIN: -> Hash Anti Join (cost=431.28..1933.44 rows=7624 width=4)
EXPLAIN: Hash Cond: (c.id = st.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories c (cost=0.42..1405.28 rows=7900 width=4)
EXPLAIN: Index Cond: (category_id = 15)
EXPLAIN: -> Hash (cost=230.94..230.94 rows=15994 width=4)
EXPLAIN: -> Seq Scan on category_staging_15 st (cost=0.00..230.94 rows=15994 width=4)
0 items deleted
7997 items inserted
----------------
6250 items in table registry.category_staging_25
245598 items in table registry.categories
309398 items in table registry.smiles
6250 items in joined registry.category_staging_25 / registry.categories
6250 items to be inserted
inserted: 3125
EXPLAIN: Aggregate (cost=173.51..173.52 rows=1 width=8)
EXPLAIN: -> Nested Loop Semi Join (cost=0.84..173.51 rows=1 width=0)
EXPLAIN: Join Filter: (categories.id = c.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories (cost=0.42..2.44 rows=1 width=4)
EXPLAIN: Index Cond: (category_id = 25)
EXPLAIN: -> Nested Loop Anti Join (cost=0.42..171.06 rows=1 width=4)
EXPLAIN: Join Filter: (c.id = st.id)
EXPLAIN: -> Index Scan using i_categories_category_id on categories c (cost=0.42..2.44 rows=1 width=4)
EXPLAIN: Index Cond: (category_id = 25)
EXPLAIN: -> Seq Scan on category_staging_25 st (cost=0.00..90.50 rows=6250 width=4)
This plan doesn't look like it'd actually take long, if the estimates
are correct.
Another data point: during this query, Postgres is burning 100% CPU and doing no I/O. Pretty much for hours if I let it go.
Thanks for your help,
Craig