Who is causing all this i/o?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Our development server (PG 8.4.4 on Ubuntu server) is constantly doing something, and I can't figure out what.  The two production servers, which are essentially identical, don't show these symptoms.  In a nutshell, it's showing 10K blocks per second of data going out, all the time, and essentially zero blocks per second of input.

To start with, if I turn off Postgres, everything is quiet:

$ /etc/init.d/postgresql stop
Stopping PostgreSQL: server stopped
ok
[radon:root] /emi/logs/londiste # vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 126140 1115944 112692 10466452    0    0   132   829    1    3  2  1 97  1
 0  0 126140 1116068 112692 10466452    0    0     0     0  150  175  0  0 100  0
 0  0 126140 1116184 112708 10466456    0    0     0    94  123  153  0  0 100  0
 0  0 126140 1116432 112708 10466456    0    0     0     0   96  126  0  0 100  0
 0  0 126140 1116928 112708 10466456    0    0     0     0   94  111  0  0 100  0
 0  0 126140 1116688 112708 10466456    0    0     0     0  114  144  0  0 100  0
 0  0 126140 1116364 112708 10466456    0    0    18    74  229  266  1  0 99  0
 0  0 126140 1116488 112708 10466492    0    0     0     0  111  138  0  0 100  0
 1  0 126140 1116868 112716 10466492    0    0     0    20   96  121  0  0 100  0

But when I restart Postgres, it looks like this, and runs this way 24/7.  Notice the "bo" column:

$ /etc/init.d/postgresql start
Starting PostgreSQL: ok
$ vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 126140 1078020 112744 10480652    0    0   132   829    1    3  2  1 97  1
 0  1 126140 1054320 112744 10487792    0    0    52 10714  900  976  2  1 97  0
 0  1 126140 1015988 112760 10511060    0    0  1828  9620 1491 3527  5  2 87  6
 0  1 126140 1008316 112768 10517936    0    0   672 10994  982 1209  1  2 86 11
 0  0 126140 1017780 112776 10528824    0    0  1466 10746 1003 1428  2  1 89  8
 0  0 126140 1013324 112784 10533680    0    0     0  9718  766  858  1  0 99  0
 1  0 126140 991600 112784 10539220    0    0     0 10714  962 1127  2  1 97  0
 0  0 126140 999924 112788 10544380    0    0     6 10726  918 1227  2  1 97  0
 0  0 126140 995700 112800 10548816    0    0     0  9744  823  889  1  1 99  0
 2  0 126140 964560 112804 10554048    0    0     2 10718  903 1196  3  1 96  0

That seems to be a lot of output data going somewhere, but there doesn't seem to be anything going on that should cause it:

postgres=# select datname, procpid, usename, substr(current_query,0,25) current_query, waiting from pg_stat_activity;
  datname   | procpid | usename  |      current_query       | waiting
------------+---------+----------+--------------------------+---------
 postgres   |   12927 | postgres | select datname, procpid, | f
 global     |   12185 | postgres | <IDLE>                   | f
 accounting |   12367 | postgres | <IDLE>                   | f
 orders     |   12225 | orders   | <IDLE>                   | f
 archive    |   12267 | postgres | <IDLE>                   | f
(5 rows)

After a lot of digging around, I found this in the /postgres/pg_stat_tmp directory.  If I list the directory including the i-nodes once every second, I find that a new 2MB file is being created roughly once every two seconds:

# ls -li
total 2384
69731459 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731459 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731459 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 3332
69731258 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
69731459 -rw------- 1 postgres postgres  970752 2011-05-20 14:07 pgstat.tmp
# ls -li
total 2384
69731459 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw------- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat

This is a development server, so there are no queries coming in from the web or other random activities.  Nothing is going on.

The other two servers are configured identically.  If I diff the configuration files, the only difference is the IP addresses for the "listen" section.

Can anyone tell me what's going on?  Why is pgstat.stat being rewritten on this server constantly and not on the other two servers?

Thanks,
Craig


--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux