Hi,
we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an hour, with disk I/O utilization (percent of time device is busy) at 100% the whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 9.1.13.
What could cause this? Note that there is no ANALYZE. Is it possible that the SELECT is actually executed, in planning the INSERT?
When executing the INSERT itself (not EXPLAIN) immediately afterwards, that logs a "temporary file" message, but the EXPLAIN invocation does not (though the disk I/O suggests that a large on-disk sort is occurring):
LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp6016.0", size 744103936
STATEMENT: INSERT INTO f_foo SELECT
[...]
During that actual execution, there's a lower disk I/O utilization (though a higher MB/s rate).
Charts of disk I/O utilization and rate are at http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement ending at 6:32pm. Metrics are collected by New Relic; their definition of I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
Here's the EXPLAIN statement:
LOG: duration: 3928778.823 ms statement: EXPLAIN INSERT INTO f_foo SELECT
t_foo.fk_d1,
t_foo.fk_d2,
t_foo.fk_d3,
t_foo.fk_d4,
t_foo.fk_d5,
t_foo.fk_d6,
t_foo.value
FROM t_foo
WHERE NOT (EXISTS
(SELECT *
FROM f_foo
WHERE f_foo.fk_d2 = t_foo.fk_d2
AND f_foo.fk_d3 = t_foo.fk_d3
AND f_foo.fk_d4 = t_foo.fk_d4
AND f_foo.fk_d5 = t_foo.fk_d5
AND f_foo.fk_d6 = t_foo.fk_d6
AND f_foo.fk_d1 = t_foo.fk_d1))
(where t_foo is a temp table previously populated using COPY, and the NOT EXISTS subquery refers to the same table we are inserting into)
Here's the EXPLAIN output:
Insert on f_foo (cost=8098210.50..9354519.69 rows=1 width=16)
-> Merge Anti Join (cost=8098210.50..9354519.69 rows=1 width=16)
Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND
(t_foo.fk_d3 = public.f_foo.fk_d3) AND
(t_foo.fk_d4 = public.f_foo.fk_d4) AND
(t_foo.fk_d5 = public.f_foo.fk_d5) AND
(t_foo.fk_d6 = public.f_foo.fk_d6) AND
(t_foo.fk_d1 = public.f_foo.fk_d1))
-> Sort (cost=3981372.25..4052850.70 rows=28591380 width=16)
Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5,
t_foo.fk_d6, t_foo.fk_d1
-> Seq Scan on t_foo (cost=0.00..440461.80 rows=28591380
width=16)
-> Sort (cost=4116838.25..4188025.36 rows=28474842 width=12)
Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3,
public.f_foo.fk_d4, public.f_foo.fk_d5,
public.f_foo.fk_d6, public.f_foo.fk_d1
-> Seq Scan on f_foo (cost=0.00..591199.42 rows=28474842
width=12)
The INSERT is indeed rather large (which is why we're issuing an EXPLAIN ahead of it to log the plan). So its long execution time is expected. But I want to understand why the EXPLAIN takes even longer.
The table looks like this:
\d f_foo
Table "public.f_foo"
Column | Type | Modifiers
--------+----------+-----------
fk_d1 | smallint | not null
fk_d2 | smallint | not null
fk_d3 | smallint | not null
fk_d4 | smallint | not null
fk_d5 | smallint | not null
fk_d6 | smallint | not null
value | integer |
Indexes:
"f_foo_pkey" PRIMARY KEY, btree (fk_d2, fk_d6, fk_d4, fk_d3, fk_d5, fk_d1) CLUSTER
"ix_f_foo_d4" btree (fk_d4)
"ix_f_foo_d3" btree (fk_d3)
"ix_f_foo_d5" btree (fk_d5)
"ix_f_foo_d6" btree (fk_d6)
Foreign-key constraints:
"f_foo_d2_fkey" FOREIGN KEY (fk_d2) REFERENCES d2(id) DEFERRABLE
"f_foo_d6_fkey" FOREIGN KEY (fk_d6) REFERENCES d6(id) DEFERRABLE
"f_foo_d5_fkey" FOREIGN KEY (fk_d5) REFERENCES d5(id) DEFERRABLE
"f_foo_d4_fkey" FOREIGN KEY (fk_d4) REFERENCES d4(id) DEFERRABLE
"f_foo_d3_fkey" FOREIGN KEY (fk_d3) REFERENCES d3(id) DEFERRABLE
Conceivably relevant (though I don't know how): this database has a very large number of table objects (1.3 million rows in pg_class). But other EXPLAINs are not taking anywhere near this long in this DB; the heavy EXPLAIN is only seen on INSERT into this and a couple of other tables with tens of millions of rows.
Any ideas?
Thanks, best regards,
- Gulli