On Sat, Nov 2, 2013 at 1:54 PM, Евгений Селявка <evg.selyavka@xxxxxxxxx> wrote:
> Please help with advice!
>
> Server
> HP ProLiant BL460c G1
>
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                8
> On-line CPU(s) list:   0-7
> Thread(s) per core:    1
> Core(s) per socket:    4
> CPU socket(s):         2
> NUMA node(s):          1
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 23
> Stepping:              6
> CPU MHz:               3000.105
> BogoMIPS:              6000.04
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              6144K
> NUMA node0 CPU(s):     0-7
>
> 32GB RAM
> [root@db3 ~]# numactl --hardware
> available: 1 nodes (0)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 32765 MB
> node 0 free: 317 MB
> node distances:
> node   0
>   0:  10
>
> RAID1 2x146GB 10k rpm
>
> CentOS release 6.3 (Final)
> Linux 2.6.32-279.11.1.el6.x86_64 #1 SMP x86_64 GNU/Linux
>
> kernel.msgmnb = 65536
> kernel.msgmax = 65536
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
> vm.swappiness = 30
> vm.dirty_background_bytes = 67108864
> vm.dirty_bytes = 536870912
>
> PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
> 20120305 (Red Hat 4.4.6-4), 64-bit
>
> listen_addresses = '*'
> port = 5433
> max_connections = 350
> shared_buffers = 8GB
> temp_buffers = 64MB
> max_prepared_transactions = 350
> work_mem = 256MB
> maintenance_work_mem = 1GB
> max_stack_depth = 4MB
> max_files_per_process = 5000
> effective_io_concurrency = 2
> wal_level = hot_standby
> synchronous_commit = off
> checkpoint_segments = 64
> checkpoint_timeout = 15min
> checkpoint_completion_target = 0.75
> max_wal_senders = 4
> wal_sender_delay = 100ms
> wal_keep_segments = 128
> random_page_cost = 3.0
> effective_cache_size = 18GB
> autovacuum = on
> autovacuum_max_workers = 5
> autovacuum_vacuum_threshold = 900
> autovacuum_analyze_threshold = 350
> autovacuum_vacuum_scale_factor = 0.1
> autovacuum_analyze_scale_factor = 0.05
> log_min_duration_statement = 500
> deadlock_timeout = 1s
> The DB size is about 20GB. There is no high write activity on the DB, but
> periodically in the PostgreSQL log I see, for example, a "select 1" with a
> duration of about 500-1000 ms.
>
> During these periods the response time from the DB is terrible. These
> periods do not correlate with high traffic. There is no other app on the
> server, and no particular cron job running on the server.
>
> Our app is written in Java and uses JDBC with internal pooling to connect
> to the DB. There are about 100 connections to the DB. This is the sar output:
>
> 12:00:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s  %vmeff
> 09:30:01 PM     73.17    302.72 134790.16      0.00  46809.73      0.00      0.00      0.00    0.00
> 09:35:01 PM     63.42    655.80 131740.74      0.00  46182.74      0.00      0.00      0.00    0.00
> 09:40:01 PM     76.87    400.62 122375.34      0.00  42096.27      0.00      0.00      0.00    0.00
> 09:45:01 PM     58.49    198.33 121922.86      0.00  42765.27      0.00      0.00      0.00    0.00
> 09:50:01 PM     52.21    485.45 136775.65      0.15  49098.65      0.00      0.00      0.00    0.00
> 09:55:01 PM     49.68    476.75 130159.24      0.00  45192.54      0.00      0.00      0.00    0.00
> 10:00:01 PM     41.35    295.34 118655.80      0.00  40786.52      0.00      0.00      0.00    0.00
> 10:05:01 PM     60.84    593.85 129890.83      0.00  44170.92      0.00      0.00      0.00    0.00
> 10:10:01 PM     52.08    471.36 132773.63      0.00  46019.13      0.00      2.41      2.41  100.00
> 10:15:01 PM     73.93    196.50 129384.21      0.33  45255.76     65.92      1.19     66.87   99.64
> 10:20:02 PM     70.35    473.16 121940.38      0.11  44061.52     81.95     37.79    119.42   99.73
> 10:25:01 PM     57.84    471.69 130583.33      0.01  46093.33      0.00      0.00      0.00    0.00
> 10:30:01 PM     52.91    321.62 119264.34      0.01  41748.19      0.00      0.00      0.00    0.00
> 10:35:01 PM     47.13    451.78 114625.62      0.02  40600.98      0.00      0.00      0.00    0.00
> 10:40:01 PM     48.96    472.41 102352.79      0.00  35402.17      0.00      0.00      0.00    0.00
> 10:45:01 PM     70.07    321.33 121423.02      0.00  43052.04      0.00      0.00      0.00    0.00
> 10:50:01 PM     46.78    479.95 128938.09      0.02  37864.07    116.64     48.97    165.07   99.67
> 10:55:02 PM    104.84    453.55 109189.98      0.00  37583.50      0.00      0.00      0.00    0.00
> 11:00:01 PM     46.23    248.75 107313.26      0.00  37278.10      0.00      0.00      0.00    0.00
> 11:05:01 PM     44.28    446.41 115598.61      0.01  40070.61      0.00      0.00      0.00    0.00
> 11:10:01 PM     38.86    457.32 100240.71      0.00  34407.29      0.00      0.00      0.00    0.00
> 11:15:01 PM     48.23    275.60 104780.84      0.00  36183.84      0.00      0.00      0.00    0.00
> 11:20:01 PM     92.74    432.49 114698.74      0.01  40413.14      0.00      0.00      0.00    0.00
> 11:25:01 PM     42.76    428.50  87769.28      0.00  29379.87      0.00      0.00      0.00    0.00
> 11:30:01 PM     36.83    260.34  85072.46      0.00  28234.50      0.00      0.00      0.00    0.00
> 11:35:01 PM     62.52    481.56  93150.67      0.00  31137.13      0.00      0.00      0.00    0.00
> 11:40:01 PM     43.50    459.10  90407.34      0.00  30241.70      0.00      0.00      0.00    0.00
>
> 12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
> 09:30:01 PM    531792  32345400     98.38    475504  29583340  10211064     27.62
> 09:35:01 PM    512096  32365096     98.44    475896  29608660  10200916     27.59
> 09:40:01 PM    455584  32421608     98.61    476276  29638952  10211652     27.62
> 09:45:01 PM    425744  32451448     98.71    476604  29662384  10206044     27.60
> 09:50:01 PM    380960  32496232     98.84    477004  29684296  10243704     27.71
> 09:55:01 PM    385644  32491548     98.83    477312  29706940  10204776     27.60
> 10:00:01 PM    348604  32528588     98.94    477672  29725476  10228984     27.67
> 10:05:01 PM    279216  32597976     99.15    478104  29751016  10281748     27.81
> 10:10:01 PM    255168  32622024     99.22    478220  29769924  10247404     27.72
> 10:15:01 PM    321188  32556004     99.02    475124  29721912  10234500     27.68
> 10:20:02 PM    441660  32435532     98.66    472336  29610476  10246288     27.71
> 10:25:01 PM    440636  32436556     98.66    472636  29634960  10219940     27.64
> 10:30:01 PM    469872  32407320     98.57    473016  29651476  10208520     27.61
> 10:35:01 PM    414540  32462652     98.74    473424  29672728  10223964     27.65
> 10:40:01 PM    354632  32522560     98.92    473772  29693016  10247752     27.72
> 10:45:01 PM    333708  32543484     98.98    474092  29720256  10227204     27.66
> 10:50:01 PM    528004  32349188     98.39    469396  29549832  10219536     27.64
> 10:55:02 PM    499068  32378124     98.48    469692  29587140  10204836     27.60
> 11:00:01 PM    462980  32414212     98.59    470032  29606764  10235820     27.68
> 11:05:01 PM    449540  32427652     98.63    470368  29626136  10209788     27.61
> 11:10:01 PM    419984  32457208     98.72    470772  29644248  10214480     27.63
> 11:15:01 PM    429900  32447292     98.69    471104  29664292  10202344     27.59
> 11:20:01 PM    394852  32482340     98.80    471528  29698052  10207604     27.61
> 11:25:01 PM    345328  32531864     98.95    471904  29717264  10215632     27.63
> 11:30:01 PM    368224  32508968     98.88    472236  29733544  10206468     27.61
> 11:35:01 PM    321800  32555392     99.02    472528  29758548  10211820     27.62
> 11:40:01 PM    282520  32594672     99.14    472860  29776952  10243516     27.71
>
> 12:00:01 AM       DEV     tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz   await   svctm   %util
> 09:30:01 PM  dev253-5   66.29    146.33    483.33      9.50      6.27   94.53    2.08   13.78
> 09:35:01 PM  dev253-5  154.80    126.85   1192.96      8.53     28.57  184.59    1.45   22.43
> 09:40:01 PM  dev253-5   92.21    153.75    686.75      9.11     11.53  125.00    1.87   17.21
> 09:45:01 PM  dev253-5   39.66    116.99    279.32      9.99      0.42   10.66    2.61   10.36
> 09:50:01 PM  dev253-5  106.73     95.58    820.70      8.58     16.77  157.12    1.68   17.88
> 09:55:01 PM  dev253-5  107.90     99.36    831.46      8.63     16.05  148.76    1.71   18.42
> 10:00:01 PM  dev253-5   62.48     82.70    471.28      8.87      5.91   94.52    2.10   13.11
> 10:05:01 PM  dev253-5  137.84    121.69   1064.03      8.60     24.48  177.31    1.56   21.52
> 10:10:01 PM  dev253-5  107.93    104.16    827.83      8.64     16.69  155.04    1.68   18.11
> 10:15:01 PM  dev253-5   40.55    126.12    277.57      9.96      0.41   10.13    2.57   10.42
> 10:20:02 PM  dev253-5  104.33    136.77    793.49      8.92     16.97  162.69    1.76   18.35
> 10:25:01 PM  dev253-5  108.04    115.36    825.26      8.71     16.68  154.36    1.76   19.05
> 10:30:01 PM  dev253-5   69.72    105.66    523.05      9.02      7.45  106.92    1.90   13.25
> 10:35:01 PM  dev253-5  101.58     91.59    781.85      8.60     15.00  147.68    1.67   16.97
> 10:40:01 PM  dev253-5  107.50     97.91    827.17      8.61     17.68  164.49    1.77   19.06
> 10:45:01 PM  dev253-5   69.98    140.13    519.57      9.43      7.09  101.25    1.96   13.72
> 10:50:01 PM  dev253-5  104.30     83.31    806.12      8.53     16.18  155.10    1.65   17.16
> 10:55:02 PM  dev253-5  106.86    209.65    790.27      9.36     15.59  145.08    1.74   18.60
> 11:00:01 PM  dev253-5   50.42     92.08    371.52      9.19      3.05   62.16    2.28   11.52
> 11:05:01 PM  dev253-5  101.06     88.31    776.57      8.56     15.12  149.58    1.67   16.90
> 11:10:01 PM  dev253-5  103.08     77.73    798.23      8.50     17.14  166.25    1.74   17.90
> 11:15:01 PM  dev253-5   57.74     96.45    428.62      9.09      5.23   90.52    2.13   12.32
> 11:20:01 PM  dev253-5   97.73    185.18    727.38      9.34     14.64  149.84    1.94   18.92
> 11:25:01 PM  dev253-5   95.03     85.52    730.31      8.58     14.42  151.79    1.79   16.97
> 11:30:01 PM  dev253-5   53.76     73.65    404.47      8.89      3.94   73.25    2.17   11.64
> 11:35:01 PM  dev253-5  110.37    125.05    842.17      8.76     16.96  153.63    1.66   18.30
> 11:40:01 PM  dev253-5  103.93     87.00    801.59      8.55     16.01  154.00    1.73   18.00
>
> As you can see, there is no high I/O activity during these periods, but the
> DB is frozen. My opinion is that I have incorrect kernel settings and/or a
> mistake in postgresql.conf, because there is no high activity on the DB.
> The load average is about 1; with high traffic it is about 1.15, according
> to our Nagios monitoring system.
>
> But sometimes the load is about 4, and those times match sar showing
> %vmeff = 100%, and the database response time increases.

Things to look at: iowait and system load. Also consider installing perf
and grabbing a profile while the issue is happening. This problem will
probably go away with 2GB shared_buffers, but before doing that we'd like
to diagnose it if possible.

merlin

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
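[Merlin's suggested mitigation, written out as a postgresql.conf fragment for
clarity; note that changing shared_buffers requires a server restart.]

```
shared_buffers = 2GB    # down from 8GB; a smaller value may avoid the
                        # reclaim stalls seen in the sar %vmeff spikes
```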
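[The diagnostics suggested above can be gathered roughly as follows. This is
a sketch for CentOS 6; it assumes the `perf` and `sysstat` packages are
installed, and the 30-second durations are arbitrary choices, not values
from the thread.]

```shell
# System-wide on-CPU profile with call graphs, captured while a stall is
# happening; the report shows where time is going (e.g. kernel page reclaim).
perf record -a -g -- sleep 30
perf report --sort comm,dso

# iowait and load at 1-second resolution during the freeze.
sar -u 1 30        # watch the %iowait column
sar -q 1 30        # runq-sz and load averages
vmstat 1 30        # 'wa' (iowait) and 'b' (blocked tasks) columns
```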