On 21 February 2013 16:23, Sergey Konoplev <gray.ru@xxxxxxxxx> wrote:
On Thu, Feb 21, 2013 at 1:59 AM, Mark Smith <smithmark662@xxxxxxxxx> wrote:
> Software: SLES 11 SP2 3.0.58-0.6.2-default x86_64, PostgreSQL 9.0.4.
[skipped]
> Problem: We have been running PostgreSQL 9.0.4 on SLES11 SP1, last kernel in
> use was 2.6.32-43-0.4, performance has always been great. Since updating
> from SLES11 SP1 to SP2 we now experience many database 'stalls' (e.g.
> normally 'instant' queries taking many seconds, any query will be slow, just
> connecting to the database will be slow).
It reminds me of a transparent huge pages defragmentation issue that was
found in recent kernels.
Transparent huge pages defragmentation could lead to unpredictable
database stalls on some Linux kernels. The recommended settings for
this are below.
db1: ~ # echo always > /sys/kernel/mm/transparent_hugepage/enabled
db1: ~ # echo madvise > /sys/kernel/mm/transparent_hugepage/defrag
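Before changing anything, it can help to see which mode is currently active. The kernel marks the active value with square brackets in these sysfs files (for example `always madvise [never]`). The following is a minimal sketch, assuming the same sysfs path as in the commands above; the helper name `thp_active` is made up for illustration.

```shell
# Hypothetical helper: extract the bracketed (active) value from a THP
# sysfs line read on stdin, e.g. "always madvise [never]" -> "never".
thp_active() {
  sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active mode for both files, skipping any that are not
# readable (for example on kernels built without THP support).
for f in enabled defrag; do
  p=/sys/kernel/mm/transparent_hugepage/$f
  if [ -r "$p" ]; then
    printf '%s: %s\n' "$f" "$(thp_active < "$p")"
  fi
done
```

Running it before and after the two `echo` commands should make it easy to confirm that the writes took effect.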
[skipped]
Sergey - your suggestion to look at transparent huge pages (THP) has resolved the issue for us, thank you so much. We had noticed abnormally high system CPU usage but didn't get much beyond that in our analysis.
We disabled THP altogether and it was quite simply as if we had turned the 'poor performance' tap off. Since then we have had no slow queries / stalls at all and system CPU is consistently very low. We changed many things whilst trying to resolve this issue but the THP change was done in isolation and we can therefore be confident that in our environment, leaving THP enabled with the default parameters is a killer.
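For anyone wanting to try the same thing: the exact command used is not shown in this thread, but disabling THP altogether is normally done by writing `never` to the `enabled` file. A rough sketch under that assumption follows; the `THP_DIR` variable and the `disable_thp` helper are illustrative names, not part of any tool.

```shell
# Assumption: THP_DIR defaults to the standard sysfs location; it is only
# a variable so the helper can be pointed elsewhere for a dry run.
THP_DIR=${THP_DIR:-/sys/kernel/mm/transparent_hugepage}

# Hypothetical helper: switch THP off entirely by writing "never".
# Writing to the real sysfs file requires root.
disable_thp() {
  echo never > "$THP_DIR/enabled"
}

# Usage (as root):
#   disable_thp && cat "$THP_DIR/enabled"
```

A write like this only changes the running kernel, so it would need to be reapplied at boot to stay in effect after a restart.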
At a later point we will experiment with enabling THP with the recommended madvise defrag setting.
Thank you to all who responded.
Mark