On Sat, Nov 14, 2015 at 12:58 AM, Jamie Koceniak <jkoceniak@xxxxxxxxxxxxx> wrote:
> Had the issue again today.
>
> Here is vmstat :
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r b swpd       free   buff     cache si so bi   bo    in    cs us sy id wa
> 24 0    0 1591718656 605656 499370336  0  0  0  371     0     0  7  1 93  0
> 25 0    0 1591701376 605656 499371936  0  0  0  600 13975 20168 20  1 79  0
> 26 0    0 1591654784 605656 499372064  0  0  0 5892 12725 14627 20  1 79  0
> 25 0    0 1591614336 605656 499372128  0  0  0  600 11665 12642 21  1 78  0
> 27 0    0 1591549952 605656 499372192  0  0  0  408 16939 23387 23  1 76  0
> 29 0    0 1591675392 605656 499372288  0  0  0  836 15380 22564 23  1 76  0
> 27 0    0 1591608704 605656 499372352  0  0  0  456 17593 27955 23  1 76  0
> 34 0    0 1591524608 605656 499372480  0  0  0 5904 18963 30915 23  1 75  0
> 23 0    0 1591632384 605656 499372576  0  0  0  704 18190 31002 22  1 77  0
> 25 0    0 1591551360 605656 499372640  0  0  0  944 12532 14095 21  1 78  0
> 24 0    0 1591613568 605656 499372704  0  0  0  416 11183 12553 20  1 79  0
> 23 0    0 1591531520 605656 499372768  0  0  0  400 12648 15540 19  1 80  0
> 22 0    0 1591510528 605656 499372800  0  0  0 6024 14670 21993 19  1 80  0
> 31 0    0 1591388800 605656 499372896  0  0  0  472 20605 28242 20  1 79  0
>
> We have a 120 CPU server :)
>
> processor : 119
> vendor_id : GenuineIntel
> cpu family : 6
> model : 62
> model name : Intel(R) Xeon(R) CPU E7-4880 v2 @ 2.50GHz

Per the numbers above, this server is very healthy: the run queue sits
between 22 and 34 on a 120 CPU machine, the CPUs are 75% or more idle,
and there is no swapping and no I/O wait. Something is not adding up
here.

I would really have liked to see a snapshot from 'top' and 'perf top'
taken at the same time. Via top we could have seen whether some of the
processors were completely loaded down while others were not being
utilized at all. That would suggest a problem with the operating
system, likely NUMA related.

*) Are you counting hyperthreading to get to the 120 CPU count?

*) Is this server virtualized?

*) What is the output of:
   lscpu | grep NUMA

*) Do you have 'taskset' installed? Can we check affinity via:
   taskset -c -p <pid>
   where <pid> is the pid of a few randomly sampled postgres processes
   at work?

*) Can you report the exact kernel version?

*) What is the output of:
   cat /sys/kernel/mm/transparent_hugepage/enabled
   cat /sys/kernel/mm/transparent_hugepage/defrag

*) Is installing a newer postgres an option?

Configuring highly SMP systems for reliable scaling may require some
progressive thinking.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
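
If no 'top' snapshot is at hand, the per-CPU imbalance check described in this message (some processors fully loaded while others sit idle) can be approximated from two reads of /proc/stat. What follows is only a rough sketch, not something taken from the thread: the function names and the 90% / 10% thresholds are illustrative assumptions, and it relies on the standard Linux /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for each per-CPU line."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(v) for v in fields[1:]]
        # Count idle + iowait as "not busy". Guest time is already included
        # in user/nice, so only the first eight counters are summed.
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks[:8])
        result[fields[0]] = (total - idle, total)
    return result


def busy_percent(before, after):
    """Percent of non-idle time per CPU between two snapshots."""
    usage = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        usage[cpu] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return usage


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots 'interval' seconds apart and return per-CPU usage."""
    with open(path) as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_proc_stat(f.read())
    return busy_percent(first, second)


if __name__ == "__main__":
    usage = sample()
    by_number = lambda name: int(name[3:])
    # Thresholds below are arbitrary, chosen only for illustration.
    saturated = sorted((c for c, p in usage.items() if p >= 90.0), key=by_number)
    idle = sorted((c for c, p in usage.items() if p <= 10.0), key=by_number)
    print("CPUs >= 90% busy:", len(saturated), saturated)
    print("CPUs <= 10% busy:", len(idle), idle)
```

Run during a slowdown, a long list of saturated CPUs next to a long list of idle ones would be the uneven pattern in question. An even spread would point away from it.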