Re: High SYS CPU - need advise

Merlin Moncure <mmoncure@xxxxxxxxx> · Fri, 16 Nov 2012 09:28:32 -0600

On Thu, Nov 15, 2012 at 6:07 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Thu, Nov 15, 2012 at 2:44 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 9000})  = 0 (Timeout)
>
> This is not entirely inconsistent with the spinlock.  Note that 1000
> is repeated 3 times, and 5000 is missing.
>
> This might just be a misleading random sample we got here.  I've seen
> similar close spacing in some simulations I've run.
>
> It is not clear to me why we use a resolution of 1 msec here.  If the
> OS's implementation of select() eventually rounds to the nearest msec,
> that is its business.  But why do we have to lose intermediate
> precision due to its decision?
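
To make the repeated 1000s and the missing 5000 concrete, here is a rough
Python simulation of the backoff as I remember it from s_lock.c: sleep,
then grow the delay by a random 0x-1x of itself rounded to whole msec, and
wrap back to the minimum once the maximum is exceeded.  The constants
(1 msec floor, 1000 msec ceiling) and the function names are my
assumptions for illustration; this is a sketch, not the actual source.

```python
import random

MIN_DELAY_MS = 1      # assumed floor; matches the 1000 usec first sleep in the strace
MAX_DELAY_MS = 1000   # assumed ceiling before the delay wraps back to the floor


def next_delay(cur_ms, rng):
    """Grow the delay by a random factor between 1x and 2x, in whole msec."""
    # Rounding to an integer number of msec is where intermediate precision is lost.
    cur_ms += int(cur_ms * rng.random() + 0.5)
    if cur_ms > MAX_DELAY_MS:
        cur_ms = MIN_DELAY_MS
    return cur_ms


def sleep_sequence(n, seed=None):
    """Return the first n sleep lengths in usec, as they would appear in select()."""
    rng = random.Random(seed)
    cur = MIN_DELAY_MS
    out = []
    for _ in range(n):
        out.append(cur * 1000)
        cur = next_delay(cur, rng)
    return out


if __name__ == "__main__":
    # Print a few sample runs to compare against the strace output by eye.
    for seed in range(3):
        print(sleep_sequence(10, seed))
```

Under this model the step from 1 msec is a coin flip between staying at 1
and going to 2, so a run of 1000s is expected, and from 4 msec the next
value can be anything from 4 to 8, so skipping 5000 is unremarkable.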

Yeah -- you're right, this is definitely a spinlock issue.  Next steps:

*) in mostly-read workloads, we have a couple of known frequent
offenders, in particular the 'BufFreelistLock'.  One way we can
influence that one is to significantly lower or raise shared_buffers.
So this is one thing to try.

*) failing that, the LWLOCK_STATS macro can be compiled in to give us
some information about the particular lock(s) we're blocking on.
Hopefully it's a lwlock -- this will make diagnosing the problem easier.

*) if we're not blocking on a lwlock, it's possibly a buffer pin related
issue?  I've seen this before, for example on an index scan that is
dependent on a seq scan.  This long thread:
http://postgresql.1045698.n5.nabble.com/9-2beta1-parallel-queries-ReleasePredicateLocks-CheckForSerializableConflictIn-in-the-oprofile-td5709812i100.html
has a lot of information about that case and deserves a review.

*) we can consider experimenting with futex
(http://archives.postgresql.org/pgsql-hackers/2012-06/msg01588.php)
to see if things improve.  This is dangerous, and could crash your
server/eat your data, so fair warning.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)