On Tue, Nov 24, 2015 at 9:57 PM, 657985552@xxxxxx <657985552@xxxxxx> wrote:
> oh .thanks i understand . but i still have a question .
> [root@pg1 pgdata]# uname -a
> Linux pg1 3.10.0-123.9.3.el7.x86_64 #1 SMP Thu Nov 6 15:06:03 UTC 2014
> x86_64 x86_64 x86_64 GNU/Linux
> [root@pg1 pgdata]# cat /etc/redhat-release
> CentOS Linux release 7.0.1406 (Core)
>
> my os is centos7 . is there THP problem in it ?

Yes. The settings posted above (*/transparent_hugepage/*) are the
smoking gun. I've had the exact same problem as you: suddenly the
database slows down to zero and just as suddenly goes back to normal.

What is happening here is that the operating system put in some
"optimizations" to help systems manage large amounts of memory
(typical server memory configurations have gone up by several orders
of magnitude since the 4kB page size was chosen). These optimizations
do not play well with postgres memory access patterns; the operating
system is forced to defragment memory at random intervals, which slows
down memory access and causes spinlock problems. Basically, postgres
and the kernel get into a very high speed argument over memory access.

Lowering shared buffers to around 2GB also provides relief. This
suggests that clock sweep is a contributor to the problem; in
particular, its maintenance of usage_count (which IIRC is changing
soon to a pure atomic update) would be a place to start sniffing
around if we wanted to Really Fix It. So far, though, no one has been
able to reproduce this issue on a non-production system.

I guess if we were using (non-portable) futexes instead of hand-written
spinlocks we'd probably have fewer problems in this area. Nevertheless,
given the huge performance risks, I really wonder what RedHat was
thinking when they enabled it by default.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
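
For anyone who wants to check the transparent_hugepage settings discussed above on their own box, here is a minimal sketch. It assumes the files live under /sys/kernel/mm/transparent_hugepage/ (other kernels or distributions may put them elsewhere) and that the kernel marks the active choice with square brackets, as in "[always] madvise never". The function names are made up for illustration and are not part of any postgres or kernel tooling.

```python
from pathlib import Path

# Assumed sysfs location (e.g. the CentOS 7 3.10 kernel); other kernels
# or distributions may expose these files under a different directory.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def parse_thp_setting(text):
    """Return the active choice from a line like '[always] madvise never'.

    The active value is the token wrapped in square brackets.
    Returns None if no bracketed token is found.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_thp_settings(base=THP_DIR):
    """Read the 'enabled' and 'defrag' knobs; a value is None if unreadable."""
    settings = {}
    for name in ("enabled", "defrag"):
        try:
            settings[name] = parse_thp_setting((Path(base) / name).read_text())
        except OSError:
            # File missing or not readable (non-Linux host, no THP support).
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_thp_settings().items():
        print(f"{name}: {value if value is not None else 'unavailable'}")
```

If "enabled" comes back as "always", THP is on for every process. Writing "never" into both files as root turns it off until the next reboot.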