We are using a third party monitoring solution called Zabbix with Postgresql 9.1.We are observing lots of sharelock problems possibly as a result of multiple, concurrent updates to the specific table called items.
2012-11-01 00:14:23 PDT zabbix zabbix 127.0.0.1 - DETAIL: Process 5983 waits for ShareLock on transaction 48349033; blocked by process 5992. Process 5992 waits for ShareLock on transaction 48349079; blocked by process 5983. Process 5983: update items set lastclock=1351754051,lastns=230190945,prevvalue=lastvalue,lastvalue='2.692934' where itemid=200589; update items set lastclock=1351754050,lastns=329344694,prevvalue=lastvalue,lastvalue='2080' where itemid=200650; update items set lastclock=1351754051,lastns=378134858,prevvalue=lastvalue,lastvalue='1' where itemid=200710; update items set lastclock=1351754051,lastns=381995730,prevorgvalue='56',prevvalue=lastvalue,lastvalue='0' where itemid=201150; update items set lastclock=1351754051,lastns=378331407,prevvalue=lastvalue,lastvalue='Unit: 0 Slot: 0 Port: 3 Gigabit - Level 0x6000001' where itemid=201300; update items set lastclock=1351754051,lastns=380237518,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0' where itemid=201350; update items set lastclock=1351754051,lastns=378222316,prevvalue=lastvalue,lastvalue='0' where itemid=201670; update items set lastclock=1351754050,lastns=319926666,prevvalue=lastvalue,lastvalue='893' where itemid=201970; update items set lastclock=1351754051,lastns=393 Process 5992: update items set lastclock=1351754043,lastns=980316865,prevvalue=lastvalue,lastvalue='65381720064' where itemid=23343; update items set lastclock=1351754044,lastns=392778204,prevvalue=lastvalue,lastvalue='887279616' where itemid=23344; update items set lastclock=1351754044,lastns=3196233,prevvalue=lastvalue,lastvalue='1351754044' where itemid=23763; update items set lastclock=1351754044,lastns=867856940,prevvalue=lastvalue,lastvalue='92.097214' where itemid=23764; update items set lastclock=1351754043,lastns=998890879,prevorgvalue='94217753500',prevvalue=lastvalue,lastvalue='14024' where itemid=23943; update items set lastclock=1351754044,lastns=865008900,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0' where itemid=23944; update items set lastclock=1351754043,lastns=970666655,prevorgvalue='66951216662',prevvalue=lastvalue,lastvalue='605816' where itemid=24003; update items set lastclock=1351754044,lastns=871362232,prevvalue=lastvalue,lastvalue='2715914240' where itemid=24004; update items set lastclock=13 2012-11-01 00:14:23 PDT zabbix zabbix 127.0.0.1 - HINT: See server log for query details. Here is our modifications to the default postgresql.conf file: listen_addresses = '*' max_connections = 300 # (change requires restart) shared_buffers = 40GB # min 128kB # WSUIT work_mem = 1024MB # WSU IT maintenance_work_mem = 512MB # WSU IT effective_io_concurrency = 4 synchronous_commit = off # immediate fsync at commit # WSUIT wal_buffers = 1024MB checkpoint_segments = 512 # in logfile segments, min 1, 16MB each # WSUIT checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0 # WSUIT effective_cache_size = 48GB # WSUIT logging_collector = on # Enable capturing of stderr and csvlog log_directory = '/var/log/pg_log' # directory where log files are written, WSUIT log_filename = 'postgresql-%a.log' # log file name pattern, log_truncate_on_rotation = on # If on, an existing log file of the log_rotation_size = 100MB # Automatic rotation of logfiles will log_min_error_statement = error # values in order of decreasing detail: log_min_duration_statement = 1000 # WSUIT -- Log statements over ten seconds log_line_prefix = '%t %u %d %h - ' # special values: log_lock_waits = on # WSUIT - Turn on logging waits autovacuum_max_workers = 6 # WSUIT -- Increase these for our larger tables datestyle = 'iso, mdy' lc_messages = 'en_US.UTF-8' # locale for system error message lc_monetary = 'en_US.UTF-8' # locale for monetary formatting lc_numeric = 'en_US.UTF-8' # locale for number formatting lc_time = 'en_US.UTF-8' # locale for time formatting postgres=# select version(); version -------------------------------------------------------------------------------------------------------------- PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit Hardware: 96 GB of RAM 4-15K RPM drives in a RAID 1+0 TOP output: top - 12:42:20 up 97 days, 21:17, 8 users, load average: 1.33, 1.19, 1.14 Tasks: 718 total, 7 running, 711 sleeping, 0 stopped, 0 zombie Cpu(s): 18.7%us, 1.1%sy, 0.0%ni, 80.1%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 99060040k total, 96891404k used, 2168636k free, 220860k buffers Swap: 16777208k total, 1467392k used, 15309816k free, 85556104k cached You can see the load is really low and virtually no I/O waits.
|