update on my problem: despite pgbouncer, the problem still occurs on my end.
Also, an interesting observation - I ran several tests with pgbench, using
queries that I think are prone to trigger the high-sys-CPU stall. What I
noticed is that when pgbench is started in prepared mode, the system
behaves fine during the stress test (high user CPU - 85-90%, low sys CPU -
5-7%) with high TPS. But when I use non-prepared mode, the sys CPU
portion jumps to 40% (and TPS drops dramatically as well, but that is
understandable). The test queries are pretty long (10kb+), with a couple
of outer joins across 1000-record tables with indexes.
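Roughly, the runs looked like this (just a sketch - the database name, client
counts and the script below are stand-ins; the real script file holds one of the
10kb+ outer-join queries):

    -- stress.sql (stand-in for the real 10kb+ query)
    SELECT a.aid, b.bbalance
      FROM pgbench_accounts a
      LEFT JOIN pgbench_branches b ON b.bid = a.bid
     WHERE a.aid = 1;

    # prepared mode: each connection parses/plans the statement once, then just executes it
    pgbench -n -M prepared -f stress.sql -c 32 -j 4 -T 60 mydb

    # simple (non-prepared) mode: every execution re-sends, re-parses and re-plans the full text
    pgbench -n -M simple -f stress.sql -c 32 -j 4 -T 60 mydb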
Maybe we are looking in the wrong place and the issue is somewhere within
the planner/parser? Is there some extensive locking used in there?
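If so, prepared mode would skip exactly that work on every execution - roughly
what PREPARE/EXECUTE does at the SQL level (illustrative stand-in query only):

    -- parsed once at PREPARE time; later executions can reuse a cached plan
    PREPARE stress_q (int) AS
      SELECT a.aid, b.bbalance
        FROM pgbench_accounts a
        LEFT JOIN pgbench_branches b ON b.bid = a.bid
       WHERE a.aid = $1;

    -- these don't go through the parser again
    EXECUTE stress_q(1);
    EXECUTE stress_q(2);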
Another observation - it's harder to trigger the high-sys-CPU stall on a
freshly restarted PostgreSQL, but if it has been running for a while, it's
much easier to do.
-- vlad
On 11/19/12 8:33 AM, Merlin Moncure wrote:
On Sun, Nov 18, 2012 at 4:24 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
On Fri, Nov 16, 2012 at 12:13 PM, Vlad <marchenko@xxxxxxxxx> wrote:
ok, I've applied that patch and ran. The stall started around 13:50:45...50
and lasted until the end
https://dl.dropbox.com/u/109778/postgresql-2012-11-16_134904-stripped.log
That isn't as much log as I expected. But I guess only the tip of the
iceberg gets logged; it could be that most spinlocks are contended but
then get acquired just before the sleep (and the log entry) would have happened.
I'm not sure how to figure out just how much spinning is going on
below the surface, but my gut feeling is there isn't enough of it to
explain the amount of slowdown you are seeing (and it would probably
be accounted for as user time rather than system time).
yes. classic spinlock contention symptoms are high user cpu and almost
no work getting done. something else is going on here.
Maybe we could whip up a dummy standalone program that uses PG's
spinlock code in an intentionally contended way and see how your
system reports resource usage for that. Maybe your system reports
fighting for cache lines or whatever is going on behind the scenes as
system time rather than user time, or something.
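Something along these lines, although with a plain test-and-set loop rather
than PG's actual s_lock.h, so it is only a rough stand-in for the idea:

    /* rough stand-in, NOT PG's spinlock code: a plain test-and-set spin
     * loop shared by N threads, to see how the OS accounts the burned CPU.
     * build: gcc -O2 -pthread spin_contend.c -o spin_contend
     * run:   time ./spin_contend 8      (then compare user vs sys time)
     */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define ITERATIONS 5000000L

    static volatile int lock = 0;
    static volatile long counter = 0;

    static void *worker(void *arg)
    {
        long i;

        (void) arg;
        for (i = 0; i < ITERATIONS; i++)
        {
            while (__sync_lock_test_and_set(&lock, 1))
                ;                       /* spin until the lock is released */
            counter++;                  /* tiny critical section */
            __sync_lock_release(&lock);
        }
        return NULL;
    }

    int main(int argc, char **argv)
    {
        pthread_t threads[64];
        int       i, nthreads;

        nthreads = (argc > 1) ? atoi(argv[1]) : 4;
        if (nthreads < 1 || nthreads > 64)
            nthreads = 4;

        for (i = 0; i < nthreads; i++)
            pthread_create(&threads[i], NULL, worker, NULL);
        for (i = 0; i < nthreads; i++)
            pthread_join(threads[i], NULL);

        printf("counter = %ld\n", counter);
        return 0;
    }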
I don't think this is the case. It's looking more and more like
scheduler issues, although it's unclear if that is due to some kind of
regression or just symptoms of backends piling up. I'm also starting
to wonder if the spinlock contention we are seeing (which, according to
your patch, really isn't all that much) is just symptomatic of
scheduler issues.
It would be a little easier to figure out what was going on there if
the log included the pid (%p) so we can see if it is the same process
doing a bunch of consecutive sleeps, or a bunch of different
processes. Also, logging the millisecond (%m rather than %t) would be
helpful.
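That is, something like this in postgresql.conf, followed by a reload:

    # put a millisecond-resolution timestamp and the backend pid
    # in front of every log line
    log_line_prefix = '%m [%p] '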
But the way that the point of contention jumps around a lot points to
a scheduler issue rather than a postgres-specific issue. A backend
gets pre-empted while holding a spinlock, and then the kernel is
resistant to rescheduling it even though all other processes are piled
up behind it. But again, that theory would lead to more log lines
than we see, I would think.
yeah. I asked for some longer strace logs but then everyone went home
for the weekend.
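Something like the following against one of the busy backends would do (the
pid is just an example):

    # per-call timestamps (-tt) plus time spent inside each syscall (-T),
    # attached to a single busy backend and written to a file
    strace -tt -T -p 12345 -o backend.strace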
Anyway, pgbouncer might help by ensuring that a connection gets
pre-empted by pgbouncer at a point where it has no locks, rather than
by the kernel at a point where it does hold locks.
yes -- I think pgbouncer is the key here because it's going to tell us
if the scheduler is the culprit. If you look carefully at the
strace logs, you'll see that system call times become unpredictable as
the backends are getting scheduled out... particularly yielding calls like
select(). So we need to determine if this is normal high-load
behavior or if there is an underlying mechanism at work.
Any chance you could try an older kernel and see if that reduces the problem?
This is one thing to try. Personally I think pgbouncer/transaction
mode is low-hanging fruit if it 'drops in' to the OP's app. If the problem
goes away, this will pretty much point the finger at Linux AFAICT (in
which case we can start messing around with the kernel).
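For the record, the relevant bit of pgbouncer.ini is just pool_mode (names and
ports below are made up):

    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    ; hand a server connection to a client only for the duration of a transaction
    pool_mode = transaction
    max_client_conn = 500
    default_pool_size = 20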
merlin