On Fri, May 6, 2016 at 10:33 AM, Alvaro Herrera <alvherre@xxxxxxxxxxxxxxx> wrote:
Hi Greg,
Greg Spiegelberg wrote:
> We were bit a couple months ago by a very similar issue where autovacuum
> ignored pg_type. A manual vacuum fixed it but since that table is abused
> by every client using libpq which is darn near everything I suspect it went
> bad in a hurry.
Hmm. The current report is about shared catalogs (pg_shdepend and
pg_database were reported as problematic) which pg_type is not, so I
doubt that this bugfix will have any impact in a problem vacuuming
pg_type.
I'm interested in seeing a more detailed report from you about the
pg_type vacuuming failure.
Not to steer the conversation elsewhere but the short of it is the system in question is a very active 9.3.5, well aware of more current versions, and in the early morning on Feb 8th before normal increase in activity and in the span of a few minutes queries went from normal sub-second responses to 30 sec to 5 minutes. Took hours to isolate because we never suspected pg_type until logging was turned up. We had to suspend operations and suspend pgbouncer to execute a manual VACUUM FULL. EXPLAIN ANALYZE before reported hundreds of thousands of rows before the vacuum and <2,000 after. All queries since have performed well since.
Stepping into the wayback machine, before VACUUM FULL:
Seq Scan on pg_type (cost=0.00..155188.24 rows=4788024 width=68) (actual time=0.007..40951.243 rows=1889 loops=1)
Total runtime: 40952.907 ms
Total runtime: 40952.907 ms
Attempted a simple VACUUM VERBOSE:
vacuum verbose pg_type;
INFO: vacuuming "pg_catalog.pg_type"
INFO: scanned index "pg_type_oid_index" to remove 860 row versions
DETAIL: CPU 0.98s/3.63u sec elapsed 96.52 sec.
INFO: scanned index "pg_type_typname_nsp_index" to remove 860 row versions
DETAIL: CPU 1.89s/7.14u sec elapsed 247.72 sec.
INFO: "pg_type": removed 860 row versions in 36 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.04 sec.
INFO: index "pg_type_oid_index" now contains 2014 row versions in 25824
pages
DETAIL: 715 index row versions were removed.
25717 index pages have been deleted, 25714 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_type_typname_nsp_index" now contains 2014 row versions in
35903 pages
DETAIL: 860 index row versions were removed.
35560 index pages have been deleted, 35488 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_type": found 163 removable, 1909 nonremovable row versions in
72 out of 107308 pages
DETAIL: 36 dead row versions cannot be removed yet.
There were 3056 unused item pointers.
0 pages are entirely empty.
CPU 2.88s/10.79u sec elapsed 344.29 sec.
INFO: "pg_type": stopping truncate due to conflicting lock request
VACUUM
INFO: vacuuming "pg_catalog.pg_type"
INFO: scanned index "pg_type_oid_index" to remove 860 row versions
DETAIL: CPU 0.98s/3.63u sec elapsed 96.52 sec.
INFO: scanned index "pg_type_typname_nsp_index" to remove 860 row versions
DETAIL: CPU 1.89s/7.14u sec elapsed 247.72 sec.
INFO: "pg_type": removed 860 row versions in 36 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.04 sec.
INFO: index "pg_type_oid_index" now contains 2014 row versions in 25824
pages
DETAIL: 715 index row versions were removed.
25717 index pages have been deleted, 25714 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_type_typname_nsp_index" now contains 2014 row versions in
35903 pages
DETAIL: 860 index row versions were removed.
35560 index pages have been deleted, 35488 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_type": found 163 removable, 1909 nonremovable row versions in
72 out of 107308 pages
DETAIL: 36 dead row versions cannot be removed yet.
There were 3056 unused item pointers.
0 pages are entirely empty.
CPU 2.88s/10.79u sec elapsed 344.29 sec.
INFO: "pg_type": stopping truncate due to conflicting lock request
VACUUM
The VACUUM FULL:
vacuum full verbose pg_type;
INFO: vacuuming "pg_catalog.pg_type"
INFO: "pg_type": found 42 removable, 1895 nonremovable row versions in 107308 pages
DETAIL: 4 dead row versions cannot be removed yet.
CPU 0.10s/0.11u sec elapsed 0.22 sec.
VACUUM
Time: 957.159 ms
INFO: vacuuming "pg_catalog.pg_type"
INFO: "pg_type": found 42 removable, 1895 nonremovable row versions in 107308 pages
DETAIL: 4 dead row versions cannot be removed yet.
CPU 0.10s/0.11u sec elapsed 0.22 sec.
VACUUM
Time: 957.159 ms
EXPLAIN ANALYZE then and now:
Seq Scan on pg_type (cost=0.00..680.17 rows=1917 width=754) (actual time=0.004..10.467 rows=1917 loops=1)
Total runtime: 10.999 ms
Autovacuum settings:
select name,setting from pg_settings where name ~ 'autov';
name | setting
-------------------------------------+-----------
autovacuum | on
autovacuum_analyze_scale_factor | 0.1
autovacuum_analyze_threshold | 50
autovacuum_freeze_max_age | 200000000
autovacuum_max_workers | 3
autovacuum_multixact_freeze_max_age | 400000000
autovacuum_naptime | 60
autovacuum_vacuum_cost_delay | 20
autovacuum_vacuum_cost_limit | -1
autovacuum_vacuum_scale_factor | 0.2
autovacuum_vacuum_threshold | 50
log_autovacuum_min_duration | -1
(12 rows)
I know, not much to go on.
> Question is, will this patch be backported to 9.3?
Yes, in my opinion we would backpatch it, back to 9.1 even.
Good news.
Thanks,
-Greg