Thanks for the tip about using perf or oprofile, but ntp might not be the problem at all. During testing with ntp off, the problem was still reproducible, albeit less frequently. It might have something to do with excessive row locking. We are currently looking through the EXPLAIN results in the postgresql log to see whether there is a pattern to be observed, and reading the pg_locks chapters in the books ;-). It will take some time to understand what's going on.
I might still need to use the tools to identify where in the code the user-space CPU load comes from.
I'll keep you posted.
Kind regards,
Dennis Brouwer
M4n
On Mon, Sep 24, 2012 at 6:30 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Dennis Brouwer <dennis.brouwer@xxxxxx> writes:
> Last week I was repeatedly able to run all these tests on the database
> without any issue but recently, all of a sudden at random, some of the
> queries performed a factor 100 less. It may take hours to complete the
> transaction. At the same moment we see a dramatic decrease in IO and the
> CPU is nearly 100% busy in user space.
> After days of testing I may have found the cause: the ntp client. If I stop
> the ntp client the problem vanishes.
> I have started reading on spinlocks and other related material but this all
> is rather complicated stuff and kindly ask in what direction I should
> search. The issue can be reproduced for both postgresql-9.1 and
> postgresql-9.2 and perhaps can be rephrased as: Very high CPU load in user
> space (at random) with ntp enabled and (long?) running transactions.
That's really bizarre. What "ntp client" are you using exactly? Is it
configured to adjust the system clock by slewing, or by stepping? Can
you identify what part of the code is eating CPU (try perf or oprofile)?
regards, tom lane