All my sar statistics
sar -r
11:40:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
01:15:01 PM 269108 32608084 99.18 367144 29707240 10289444 27.83
01:20:01 PM 293560 32583632 99.11 367428 29674272 10287136 27.82
01:25:01 PM 417640 32459552 98.73 366148 29563220 10289220 27.83
01:30:01 PM 399448 32477744 98.79 366812 29596520 10298876 27.85
01:35:01 PM 432332 32444860 98.69 367412 29616524 10277484 27.80
sar -d -p
11:40:02 AM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
01:15:01 PM vg_root-lv_pgsql 73.10 116.59 540.15 8.98 6.98 95.44 1.61 11.78
01:20:01 PM vg_root-lv_pgsql 71.39 170.21 508.21 9.50 5.44 76.23 1.72 12.31
01:25:01 PM vg_root-lv_pgsql 54.32 136.21 381.53 9.53 3.58 65.98 1.81 9.85
01:30:01 PM vg_root-lv_pgsql 81.35 167.98 585.25 9.26 8.15 100.13 1.63 13.25
01:35:01 PM vg_root-lv_pgsql 66.75 126.02 482.72 9.12 5.59 83.73 1.78 11.90
sar -u ALL
11:40:02 AM CPU %usr %nice %sys %iowait %steal %irq %soft %guest %idle
01:15:01 PM all 8.57 0.00 1.52 1.46 0.00 0.00 0.05 0.00 88.40
01:20:01 PM all 8.50 0.00 1.53 1.61 0.00 0.00 0.05 0.00 88.31
01:25:01 PM all 9.00 0.00 1.78 1.27 0.00 0.00 0.06 0.00 87.89
01:30:01 PM all 9.58 0.00 1.63 1.71 0.00 0.00 0.06 0.00 87.01
01:35:01 PM all 8.75 0.00 1.47 1.57 0.00 0.00 0.06 0.00 88.15
As you suggested, I installed perf and collected statistics with the command:
perf record -g -f -u postgres -e block:block_rq_*,syscalls:sys_enter_write,syscalls:sys_enter_fsync
But I really don't understand the perf report output and which values I need to look at. Could you give me some advice on how to read a perf report? Which events from perf list should I trace, and what are good and bad values in this report given my hardware?
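For example, would something along these lines be the right way to start looking at the recorded data? (Just my guess, using only the standard perf subcommands.)

perf report --stdio --sort comm,dso   # summary of recorded events per process and per binary
perf report                           # interactive view; shows call graphs since record used -g
perf script | less                    # raw listing of each recorded tracepoint event with its backtrace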
2013/11/7 Merlin Moncure <mmoncure@xxxxxxxxx>
Need to see: iowait, system load.

On Sat, Nov 2, 2013 at 1:54 PM, Евгений Селявка <evg.selyavka@xxxxxxxxx> wrote:
> Please help with advice!
>
> Server
> HP ProLiant BL460c G1
>
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 8
> On-line CPU(s) list: 0-7
> Thread(s) per core: 1
> Core(s) per socket: 4
> CPU socket(s): 2
> NUMA node(s): 1
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 23
> Stepping: 6
> CPU MHz: 3000.105
> BogoMIPS: 6000.04
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 6144K
> NUMA node0 CPU(s): 0-7
>
> 32GB RAM
> [root@db3 ~]# numactl --hardware
> available: 1 nodes (0)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 32765 MB
> node 0 free: 317 MB
> node distances:
> node 0
> 0: 10
>
>
> RAID1 2x146GB 10k rpm
>
> CentOS release 6.3 (Final)
> Linux 2.6.32-279.11.1.el6.x86_64 #1 SMP x86_64 GNU/Linux
>
>
> kernel.msgmnb = 65536
> kernel.msgmax = 65536
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
> vm.swappiness = 30
> vm.dirty_background_bytes = 67108864
> vm.dirty_bytes = 536870912
>
>
> PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
> 20120305 (Red Hat 4.4.6-4), 64-bit
>
> listen_addresses = '*'
> port = 5433
> max_connections = 350
> shared_buffers = 8GB
> temp_buffers = 64MB
> max_prepared_transactions = 350
> work_mem = 256MB
> maintenance_work_mem = 1GB
> max_stack_depth = 4MB
> max_files_per_process = 5000
> effective_io_concurrency = 2
> wal_level = hot_standby
> synchronous_commit = off
> checkpoint_segments = 64
> checkpoint_timeout = 15min
> checkpoint_completion_target = 0.75
> max_wal_senders = 4
> wal_sender_delay = 100ms
> wal_keep_segments = 128
> random_page_cost = 3.0
> effective_cache_size = 18GB
> autovacuum = on
> autovacuum_max_workers = 5
> autovacuum_vacuum_threshold = 900
> autovacuum_analyze_threshold = 350
> autovacuum_vacuum_scale_factor = 0.1
> autovacuum_analyze_scale_factor = 0.05
> log_min_duration_statement = 500
> deadlock_timeout = 1s
>
>
> DB size is about 20GB. There is no high write activity on the DB. But
> periodically in the postgresql log I see entries like: "select 1" with a
> duration of about 500-1000 ms.
>
> During these periods the response time from the DB is terrible. These periods
> are not correlated with high traffic. There is no other app on the server,
> and no specific cron job on the server.
>
> Our app is written in Java and uses JDBC with internal pooling to connect to
> the DB. There are about 100 connections to the DB. This is the sar output:
>
> 12:00:01 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
> 09:30:01 PM 73.17 302.72 134790.16 0.00 46809.73 0.00 0.00 0.00 0.00
> 09:35:01 PM 63.42 655.80 131740.74 0.00 46182.74 0.00 0.00 0.00 0.00
> 09:40:01 PM 76.87 400.62 122375.34 0.00 42096.27 0.00 0.00 0.00 0.00
> 09:45:01 PM 58.49 198.33 121922.86 0.00 42765.27 0.00 0.00 0.00 0.00
> 09:50:01 PM 52.21 485.45 136775.65 0.15 49098.65 0.00 0.00 0.00 0.00
> 09:55:01 PM 49.68 476.75 130159.24 0.00 45192.54 0.00 0.00 0.00 0.00
> 10:00:01 PM 41.35 295.34 118655.80 0.00 40786.52 0.00 0.00 0.00 0.00
> 10:05:01 PM 60.84 593.85 129890.83 0.00 44170.92 0.00 0.00 0.00 0.00
> 10:10:01 PM 52.08 471.36 132773.63 0.00 46019.13 0.00 2.41 2.41 100.00
> 10:15:01 PM 73.93 196.50 129384.21 0.33 45255.76 65.92 1.19 66.87 99.64
> 10:20:02 PM 70.35 473.16 121940.38 0.11 44061.52 81.95 37.79 119.42 99.73
> 10:25:01 PM 57.84 471.69 130583.33 0.01 46093.33 0.00 0.00 0.00 0.00
> 10:30:01 PM 52.91 321.62 119264.34 0.01 41748.19 0.00 0.00 0.00 0.00
> 10:35:01 PM 47.13 451.78 114625.62 0.02 40600.98 0.00 0.00 0.00 0.00
> 10:40:01 PM 48.96 472.41 102352.79 0.00 35402.17 0.00 0.00 0.00 0.00
> 10:45:01 PM 70.07 321.33 121423.02 0.00 43052.04 0.00 0.00 0.00 0.00
> 10:50:01 PM 46.78 479.95 128938.09 0.02 37864.07 116.64 48.97 165.07 99.67
> 10:55:02 PM 104.84 453.55 109189.98 0.00 37583.50 0.00 0.00 0.00 0.00
> 11:00:01 PM 46.23 248.75 107313.26 0.00 37278.10 0.00 0.00 0.00 0.00
> 11:05:01 PM 44.28 446.41 115598.61 0.01 40070.61 0.00 0.00 0.00 0.00
> 11:10:01 PM 38.86 457.32 100240.71 0.00 34407.29 0.00 0.00 0.00 0.00
> 11:15:01 PM 48.23 275.60 104780.84 0.00 36183.84 0.00 0.00 0.00 0.00
> 11:20:01 PM 92.74 432.49 114698.74 0.01 40413.14 0.00 0.00 0.00 0.00
> 11:25:01 PM 42.76 428.50 87769.28 0.00 29379.87 0.00 0.00 0.00 0.00
> 11:30:01 PM 36.83 260.34 85072.46 0.00 28234.50 0.00 0.00 0.00 0.00
> 11:35:01 PM 62.52 481.56 93150.67 0.00 31137.13 0.00 0.00 0.00 0.00
> 11:40:01 PM 43.50 459.10 90407.34 0.00 30241.70 0.00 0.00 0.00 0.00
>
> 12:00:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
> 09:30:01 PM 531792 32345400 98.38 475504 29583340 10211064 27.62
> 09:35:01 PM 512096 32365096 98.44 475896 29608660 10200916 27.59
> 09:40:01 PM 455584 32421608 98.61 476276 29638952 10211652 27.62
> 09:45:01 PM 425744 32451448 98.71 476604 29662384 10206044 27.60
> 09:50:01 PM 380960 32496232 98.84 477004 29684296 10243704 27.71
> 09:55:01 PM 385644 32491548 98.83 477312 29706940 10204776 27.60
> 10:00:01 PM 348604 32528588 98.94 477672 29725476 10228984 27.67
> 10:05:01 PM 279216 32597976 99.15 478104 29751016 10281748 27.81
> 10:10:01 PM 255168 32622024 99.22 478220 29769924 10247404 27.72
> 10:15:01 PM 321188 32556004 99.02 475124 29721912 10234500 27.68
> 10:20:02 PM 441660 32435532 98.66 472336 29610476 10246288 27.71
> 10:25:01 PM 440636 32436556 98.66 472636 29634960 10219940 27.64
> 10:30:01 PM 469872 32407320 98.57 473016 29651476 10208520 27.61
> 10:35:01 PM 414540 32462652 98.74 473424 29672728 10223964 27.65
> 10:40:01 PM 354632 32522560 98.92 473772 29693016 10247752 27.72
> 10:45:01 PM 333708 32543484 98.98 474092 29720256 10227204 27.66
> 10:50:01 PM 528004 32349188 98.39 469396 29549832 10219536 27.64
> 10:55:02 PM 499068 32378124 98.48 469692 29587140 10204836 27.60
> 11:00:01 PM 462980 32414212 98.59 470032 29606764 10235820 27.68
> 11:05:01 PM 449540 32427652 98.63 470368 29626136 10209788 27.61
> 11:10:01 PM 419984 32457208 98.72 470772 29644248 10214480 27.63
> 11:15:01 PM 429900 32447292 98.69 471104 29664292 10202344 27.59
> 11:20:01 PM 394852 32482340 98.80 471528 29698052 10207604 27.61
> 11:25:01 PM 345328 32531864 98.95 471904 29717264 10215632 27.63
> 11:30:01 PM 368224 32508968 98.88 472236 29733544 10206468 27.61
> 11:35:01 PM 321800 32555392 99.02 472528 29758548 10211820 27.62
> 11:40:01 PM 282520 32594672 99.14 472860 29776952 10243516 27.71
>
> 12:00:01 AM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
> 09:30:01 PM dev253-5 66.29 146.33 483.33 9.50 6.27 94.53 2.08 13.78
> 09:35:01 PM dev253-5 154.80 126.85 1192.96 8.53 28.57 184.59 1.45 22.43
> 09:40:01 PM dev253-5 92.21 153.75 686.75 9.11 11.53 125.00 1.87 17.21
> 09:45:01 PM dev253-5 39.66 116.99 279.32 9.99 0.42 10.66 2.61 10.36
> 09:50:01 PM dev253-5 106.73 95.58 820.70 8.58 16.77 157.12 1.68 17.88
> 09:55:01 PM dev253-5 107.90 99.36 831.46 8.63 16.05 148.76 1.71 18.42
> 10:00:01 PM dev253-5 62.48 82.70 471.28 8.87 5.91 94.52 2.10 13.11
> 10:05:01 PM dev253-5 137.84 121.69 1064.03 8.60 24.48 177.31 1.56 21.52
> 10:10:01 PM dev253-5 107.93 104.16 827.83 8.64 16.69 155.04 1.68 18.11
> 10:15:01 PM dev253-5 40.55 126.12 277.57 9.96 0.41 10.13 2.57 10.42
> 10:20:02 PM dev253-5 104.33 136.77 793.49 8.92 16.97 162.69 1.76 18.35
> 10:25:01 PM dev253-5 108.04 115.36 825.26 8.71 16.68 154.36 1.76 19.05
> 10:30:01 PM dev253-5 69.72 105.66 523.05 9.02 7.45 106.92 1.90 13.25
> 10:35:01 PM dev253-5 101.58 91.59 781.85 8.60 15.00 147.68 1.67 16.97
> 10:40:01 PM dev253-5 107.50 97.91 827.17 8.61 17.68 164.49 1.77 19.06
> 10:45:01 PM dev253-5 69.98 140.13 519.57 9.43 7.09 101.25 1.96 13.72
> 10:50:01 PM dev253-5 104.30 83.31 806.12 8.53 16.18 155.10 1.65 17.16
> 10:55:02 PM dev253-5 106.86 209.65 790.27 9.36 15.59 145.08 1.74 18.60
> 11:00:01 PM dev253-5 50.42 92.08 371.52 9.19 3.05 62.16 2.28 11.52
> 11:05:01 PM dev253-5 101.06 88.31 776.57 8.56 15.12 149.58 1.67 16.90
> 11:10:01 PM dev253-5 103.08 77.73 798.23 8.50 17.14 166.25 1.74 17.90
> 11:15:01 PM dev253-5 57.74 96.45 428.62 9.09 5.23 90.52 2.13 12.32
> 11:20:01 PM dev253-5 97.73 185.18 727.38 9.34 14.64 149.84 1.94 18.92
> 11:25:01 PM dev253-5 95.03 85.52 730.31 8.58 14.42 151.79 1.79 16.97
> 11:30:01 PM dev253-5 53.76 73.65 404.47 8.89 3.94 73.25 2.17 11.64
> 11:35:01 PM dev253-5 110.37 125.05 842.17 8.76 16.96 153.63 1.66 18.30
> 11:40:01 PM dev253-5 103.93 87.00 801.59 8.55 16.01 154.00 1.73 18.00
>
> As you can see, there is no high IO activity in this period of time, but the
> DB is frozen. My opinion is that I have an incorrect kernel setting and/or a
> mistake in postgresql.conf, because there is not high activity on the DB: the
> load average is about 1, and during high traffic it is about 1.15. This is
> from the nagios monitoring system.
>
> But sometimes the load is about 4, and this time matches with sar %vmeff = 100%
> and an increase in database response time.
Also consider installing perf and grabbing a profile while the issue is happening.
Probably this problem will go away with 2GB shared_buffers, but before
doing that we'd like to diagnose this if possible.
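Something along these lines should be enough for a first profile (just a sketch;
adjust the sleep duration so it covers one of the stalls):

perf record -a -g -- sleep 30    # sample all CPUs with call graphs for ~30 seconds during a stall
perf report --stdio              # then summarize which functions the samples landed in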
merlin
--
Best regards, Селявка Евгений