On Fri, Feb 1, 2013 at 11:01 AM, Pavan Deolasee <pavan.deolasee@xxxxxxxxx> wrote:
I'm pretty sure the I/O is from the autovacuum on the master table, because its last_autovacuum stat updates almost every minute and iotop shows something like:
Total DISK READ: 5.80 M/s | Total DISK WRITE: 115.85 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
7681 be/4 postgres 5.93 M/s 0.00 B/s 0.00 % 63.62 % postgres: autovacuum worker process fleet
15837 be/4 postgres 0.00 B/s 23.17 K/s 0.00 % 4.98 % postgres: fleet fleet 127.0.0.1(53537) idle
16683 be/4 postgres 0.00 B/s 23.17 K/s 0.00 % 2.68 % postgres: fleet fleet 127.0.0.1(53978) idle
18599 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 1.25 % postgres: fleet fleet 127.0.0.1(59529) idle
15608 be/4 postgres 0.00 B/s 23.17 K/s 0.00 % 0.04 % postgres: fleet fleet 127.0.0.1(53349) idle
10253 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 0.01 % postgres: fleet fleet 127.0.0.1(50639) idle
16524 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 0.01 % postgres: fleet fleet 127.0.0.1(53939) idle
10248 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 0.01 % postgres: fleet fleet 127.0.0.1(50634) idle
18606 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 0.01 % postgres: fleet fleet 127.0.0.1(59535) idle
14849 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 0.01 % postgres: fleet fleet 127.0.0.1(52832) idle
30308 be/4 postgres 0.00 B/s 7.72 K/s 0.00 % 0.01 % postgres: wal writer process
21528 be/4 www-data 0.00 B/s 3.86 K/s 0.00 % 0.00 % nginx: worker process
30307 be/4 postgres 0.00 B/s 38.62 K/s 0.00 % 0.00 % postgres: writer process
I just checked and our config only sets:
max_connections = 210
superuser_reserved_connections = 10
wal_buffers = 8MB
checkpoint_segments = 16
checkpoint_completion_target = 0.9
random_page_cost = 3.0
shared_buffers = 9GB
effective_cache_size = 28GB
work_mem = 88MB
maintenance_work_mem = 1GB
We're running on a server with 40GB RAM and a RAID 10 array over six 10k RPM HDDs.
Stats show that only 3 tables get frequently autovacuumed.
Do you know for sure that it's the master table that's generating all the
vacuum traffic? What about the partition tables? Do they get any
updates/deletes? It might be useful to turn autovacuum logging ON,
see which tables are getting vacuumed, and correlate that with the
increased IO activity you're seeing.
Did you change any other parameters, like checkpoint timeout/segments
etc.? It might be worthwhile to log checkpoint activity as well, to
be doubly sure.
Thanks,
Pavan
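[Editor's note: the logging Pavan suggests maps onto two standard postgresql.conf settings; a minimal sketch — the threshold value below is illustrative, not from the thread:]

```
# Log every autovacuum action that runs at least this long (0 ms = all of
# them); each log line names the table and reports pages/tuples removed.
log_autovacuum_min_duration = 0

# Log each checkpoint, including buffers written and sync/write timings.
log_checkpoints = on
```

Both settings take effect on a configuration reload (e.g. `SELECT pg_reload_conf();`); no server restart is needed.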
Stats show that only 3 tables get frequently autovacuumed:
vehicle_position (size 4016 kB):
    tuples inserted 4689127, updated 0, deleted 4689127, HOT updated 0; live 0, dead 1985
    last autovacuum 2013-02-01 11:46:46.441227+02 (autovacuum count 3761)
    last analyze 2013-01-31 14:48:00.775864+02 (analyze count 1)
    last autoanalyze 2013-02-01 11:46:46.494234+02 (autoanalyze count 3754)
    never manually vacuumed (vacuum count 0)

vehicle (size 2448 kB):
    tuples inserted 2124, updated 934640, deleted 0, HOT updated 924003; live 2124, dead 390
    last autovacuum 2013-02-01 11:47:46.067695+02 (autovacuum count 1293)
    last autoanalyze 2013-02-01 11:47:46.325444+02 (autoanalyze count 2038)
    never manually vacuumed or analyzed (vacuum count 0, analyze count 0)

input_current (size 2336 kB):
    tuples inserted 10032, updated 1649206, deleted 0, HOT updated 1635902; live 10032, dead 1728
    last autovacuum 2013-02-01 11:45:46.0681+02 (autovacuum count 442)
    last autoanalyze 2013-02-01 11:45:46.215048+02 (autoanalyze count 1294)
    never manually vacuumed or analyzed (vacuum count 0, analyze count 0)
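[Editor's note: stats like the above come from the pg_stat_user_tables view; the exact query used in the thread isn't shown, but a sketch along these lines produces the same columns:]

```sql
SELECT relname                                  AS "Table Name",
       n_tup_ins                                AS "Tuples inserted",
       n_tup_upd                                AS "Tuples updated",
       n_tup_del                                AS "Tuples deleted",
       n_tup_hot_upd                            AS "Tuples HOT updated",
       n_live_tup                               AS "Live Tuples",
       n_dead_tup                               AS "Dead Tuples",
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze,
       vacuum_count, autovacuum_count,
       analyze_count, autoanalyze_count,
       pg_size_pretty(pg_relation_size(relid))  AS "Size"
FROM pg_stat_user_tables
ORDER BY autovacuum_count DESC;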