On 2017-09-20 13:00:34 -0500, Jerry Sievers wrote:
> >> Pg 9.3 on monster 2T/192 CPU Xenial thrashing
> >
> > Not sure what the word "thrashing" in that sentence means.
>
> Cases of dozens or hundreds of sessions running typical statements for
> this system, but pegging their CPUs at 100%.  Seems to be triggered by
> certain heavyweight batch jobs kicking off on this generally OLTP
> system.
>
> ISTM there might be LW lock contention happening around some sort of
> shared resource where the lock wait implementation is a CPU spinner.

Yes, we improved that a lot in 9.5, 9.6 and 10. The really bad
scenarios - I've seen 95% of cpu time spent in locking - should all be
fixed.

I'd try to make sure that both transparent hugepages and zone reclaim
mode are disabled - the latter probably is already, but the former
might still cause some problems. See [1] for a way to check both.

> > Things have improved a lot since 9.3 WRT scalability, so I'd not
> > infer too much from 9.3 performance on a larger box.
>
> Understood.  The situation got worse when we moved to the even bigger
> box, also running a 4.x kernel, which I presume was nowhere near
> existent when 9.3 was our current Pg version.

I suspect it's more the bigger box than the newer kernel. The more
sockets and cores you have, the more lock contention bites you. That's
because inter-socket / cpu transfers get more expensive with more
cores.

> >> Upgrade pending but we recently started having $interesting
> >> performance issues at times looking like I/O slowness and other
> >> times apparently causing CPU spins.
> >
> > That's not something we can really usefully comment on given the
> > amount of information.
>
> Ack'd.
>
> I'd like to strace some of the spinning backends when/if we get
> another opportunity to observe the problem to see if by syscall or
> libfunc name we can learn more about what's the cause.

I think the causes are known and fixed - I don't think there's much you
can do besides upgrading, unless you want to backport a number of
complex patches yourself.

FWIW, usually perf gives better answers than strace in this type of
scenario - a backend spinning on locks burns its cpu in userspace,
where strace sees little. See [2] for a starting point.

> >> Anyway, shared_buffer coherency generally high but does take big
> >> dips that are sometimes sustained for seconds or even minutes.
> >
> > "shared_buffer coherency"?
>
> As measured by querying pg_stat_database and comparing total reads to
> read hits.  Run frequently, such as once every 5 seconds, and factored
> into a hit percentage.  May stay up around 100% for several ticks but
> then go way down, which may or may not be sustained.
>
> This is an OLTP app using Rails, with hundreds of tables both trivial
> in structure as well as having partitions, large payloads... TOAST and
> the like.
>
> TPS can measure in the ~5-10k range.

That's cache hit rate, not coherency ;)  A query along the lines of [3]
computes it straight from pg_stat_database.

- Andres
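
[1] A sketch of the relevant knobs on a stock Linux kernel - the paths
below are the common ones but can differ by distro, and none of these
settings survive a reboot without a sysctl.conf entry or init script:

    # check current settings; "[never]" resp. 0 mean already disabled
    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /proc/sys/vm/zone_reclaim_mode

    # disable transparent hugepages until the next reboot
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

    # disable zone reclaim
    sysctl -w vm.zone_reclaim_mode=0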
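
[2] For instance, sampling the whole box with call graphs while the
problem is visible - the flags are just a reasonable starting point,
and readable stacks require debug symbols for the postgres binary:

    # sample all CPUs for 30 seconds, then browse the report
    perf record -a -g -- sleep 30
    perf report

    # or watch a single spinning backend live; <pid> is a placeholder
    perf top -p <pid>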
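
[3] A minimal sketch - "mydb" is a placeholder, and keep in mind that
blks_read/blks_hit are cumulative since the last stats reset, so a
point-in-time rate needs the difference between two samples, as you're
presumably already computing:

    # "mydb" is a placeholder database name
    psql -d mydb -c "
      SELECT datname,
             blks_hit,
             blks_read,
             round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2)
               AS hit_pct
        FROM pg_stat_database
       ORDER BY blks_hit + blks_read DESC;"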