On Sun, Nov 18, 2012 at 4:24 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Fri, Nov 16, 2012 at 12:13 PM, Vlad <marchenko@xxxxxxxxx> wrote:
>> ok, I've applied that patch and ran it. The stall started around
>> 13:50:45...50 and lasted until the end
>>
>> https://dl.dropbox.com/u/109778/postgresql-2012-11-16_134904-stripped.log
>
> That isn't as much log as I expected. But I guess only the tip of the
> iceberg gets logged; it could be that most spinlocks are contended but
> then get acquired just before the sleep (and log) would have happened.
>
> I'm not sure how to figure out just how much spinning is going on
> below the surface, but my gut feeling is there isn't enough of it to
> explain the amount of slowdown you are seeing (and it would probably
> be accounted for as user time rather than system time).

yes. classic spinlock contention symptoms are high user cpu and almost
no work getting done. something else is going on here.

> Maybe we could whip up a dummy standalone program that uses PG's
> spinlock code in an intentionally contended way and see how your
> system reports resource usage for that. Maybe your system reports
> fighting for cache lines or whatever is going on behind the scenes as
> system time rather than user time, or something.

I don't think this is the case. It's looking more and more like
scheduler issues, although it's unclear if that is due to some kind of
regression or just a symptom of backends piling up. I'm also starting
to wonder if the spinlock contention we are seeing (which, according
to your patch, really isn't all that much) is just symptomatic of
scheduler issues. (a rough cut at the standalone test program you
describe is at the end of this mail.)

> It would be a little easier to figure out what was going on there if
> the log included the pid (%p) so we can see if it is the same process
> doing a bunch of consecutive sleeps, or a bunch of different
> processes. Also, logging the millisecond (%m rather than %t) would be
> helpful.
>
> But the way that the point of contention jumps around a lot points to
> a scheduler issue rather than a postgres-specific issue. A backend
> gets pre-empted while holding a spinlock, and then the kernel is
> resistant to rescheduling it even though all other processes are piled
> up behind it. But again, that theory would lead to more log lines
> than we see, I would think.

yeah. I asked for some longer strace logs, but then everyone went home
for the weekend. (the log_line_prefix change and the strace invocation
I have in mind are also below.)

> Anyway, pgbouncer might help by ensuring that a connection gets
> pre-empted by pgbouncer at a point where it has no locks, rather than
> by the kernel at a point where it does hold locks.

yes -- I think pgbouncer is the key here, because it's going to tell
us whether the scheduler is the culprit. If you look carefully at the
strace logs, you'll see that system call times become unpredictable as
backends get scheduled out...particularly yielding calls like
select(). So we need to determine whether this is normal high-load
behavior or there is an underlying mechanism.

> Any chance you could try an older kernel and see if that reduces the problem?

This is one thing to try. Personally I think pgbouncer in transaction
mode is the low-hanging fruit if it 'drops in' to the OP's app (sample
config below). If the problem goes away, that will pretty much point
the finger at Linux AFAICT (in which case we can start messing around
with the kernel).

merlin
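
p.s. here's the rough cut at the standalone contention test Jeff
described. caveat: this is a sketch using gcc's atomic builtins and
threads, not PG's actual s_lock.c or per-process shared memory, and
the file name is whatever you like, so it only approximates what a
backend does:

/*
 * spinstress.c -- crude stand-in for a contended PG spinlock.
 * Uses a test-and-test-and-set loop on gcc's __sync builtins,
 * which is roughly what s_lock does before it starts sleeping.
 *
 * build: gcc -O2 -pthread spinstress.c -o spinstress
 */
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

static volatile int lock = 0;       /* 0 = free, 1 = held */
static volatile long counter = 0;   /* shared work protected by the lock */

static void
spin_acquire(void)
{
    while (__sync_lock_test_and_set(&lock, 1))
    {
        /* lock was held: spin on a plain read until it looks free */
        while (lock)
            ;
    }
}

static void
spin_release(void)
{
    __sync_lock_release(&lock);
}

static void *
worker(void *arg)
{
    long        i;

    for (i = 0; i < 1000000; i++)
    {
        spin_acquire();
        counter++;                  /* deliberately tiny critical section */
        spin_release();
    }
    return NULL;
}

int
main(int argc, char **argv)
{
    int         nthreads = (argc > 1) ? atoi(argv[1]) : 4;
    pthread_t  *tids = malloc(nthreads * sizeof(pthread_t));
    int         i;

    for (i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, NULL);
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    printf("counter = %ld (expected %ld)\n",
           counter, (long) nthreads * 1000000);
    return 0;
}

run it under time(1) with more threads than you have cores; the
interesting question is whether your box charges the spinning to user
time or to system time.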
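
and for the archives, the logging change Jeff asked for is just a
postgresql.conf tweak, picked up on reload:

# log millisecond timestamps and the backend pid on every line
log_line_prefix = '%m [%p] '

then 'pg_ctl reload' (or SELECT pg_reload_conf()) and the sleep
messages will show whether it's one process sleeping repeatedly or
many different ones.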
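
for whoever grabs the longer strace next week, something along these
lines is what I'm after (the pid and output path are placeholders):

strace -tt -T -f -p <backend pid> -o /tmp/backend.trace

-tt gives microsecond timestamps, -T shows the wall-clock time spent
inside each syscall, and -f follows children; with that we can see
exactly how long those select() calls are getting parked.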
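
finally, a minimal pgbouncer config for what I mean by transaction
mode 'dropping in' -- the database name, paths, and pool sizes here
are made up, so adjust them to the real app:

; pgbouncer.ini -- sketch; names and sizes are hypothetical
[databases]
; route "appdb" through the pooler to the local postgres
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; release the server connection at every transaction end
pool_mode = transaction
; keep the number of live backends small; tune to core count
default_pool_size = 20
max_client_conn = 500

the app then connects to port 6432 instead of 5432. the point, per
Jeff's note above, is that connection hand-off happens at transaction
boundaries, where the backend holds no locks, instead of wherever the
kernel decides to pre-empt.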