Re: backend suddenly becomes slow, then remains slow

Andrew Dunstan <andrew.dunstan@xxxxxxxxxxxxx> · Thu, 27 Dec 2012 12:43:31 -0500

On 12/26/2012 11:03 PM, Jeff Janes wrote:
On Fri, Dec 14, 2012 at 10:40 AM, Andrew Dunstan <andrew.dunstan@xxxxxxxxxxxxx> wrote:
> One of my clients has an odd problem. Every so often a backend will suddenly
> become very slow. The odd thing is that once this has happened it remains
> slowed down, for all subsequent queries. Zone reclaim is off. There is no IO
> or CPU spike, no checkpoint issues or stats timeouts, no other symptom that
> we can see.

By "no spike", do you mean that the system as a whole is not using an 
unusual amount of IO or CPU, or that this specific slow back-end is 
not using an unusual amount?

Both, really.

Could you strace it and see what it is doing?

Not very easily, because it's a pool connection and we've lowered the 
pool session lifetime as part of the amelioration :-) So it's not 
happening very much any more.
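
If a slow backend does turn up before the pooler recycles it, something along these lines might be enough to catch it. This is only a minimal sketch: the helper name strace_summary_command is made up for illustration, and it assumes the backend's PID has already been obtained (for example by running SELECT pg_backend_pid() on the affected session) and that strace is run on the database host by a user allowed to attach to that process. The -c flag asks strace for a per-syscall count and time summary, and -p attaches to an already running process.

```python
import shlex


def strace_summary_command(backend_pid):
    """Build the argv for attaching strace to a running backend.

    Hypothetical helper for illustration only. The PID is assumed to
    come from SELECT pg_backend_pid() on the slow session.
    """
    if backend_pid <= 0:
        raise ValueError("backend_pid must be a positive integer")
    # -c: summarize syscall counts and time; -p: attach to this PID.
    return ["strace", "-c", "-p", str(backend_pid)]


if __name__ == "__main__":
    argv = strace_summary_command(12345)  # 12345 is a placeholder PID
    # Print a copy-pasteable command line; interrupt strace with Ctrl-C
    # after a while to detach and see the summary.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The sketch only prints the command rather than running it, since attaching to a production backend is something one would want to do by hand.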

> The problem was a lot worse than it is now, but two steps have
> alleviated it mostly, but not completely: much less aggressive autovacuuming
> and reducing the maximum lifetime of backends in the connection pooler to 30
> minutes.

Do you have a huge number of tables?  Maybe over the course of a 
long-lived connection, it touches enough tables to bloat the relcache 
/ syscache.  I don't know how the autovac would be involved in that, 
though.

Yes, we do indeed have a huge number of tables. This seems a plausible 
thesis.
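
One rough way to test the cache-bloat idea would be to watch a backend's resident memory over the life of a pooled connection and compare it with the number of tables in the database. The following is a minimal sketch under several assumptions: it is Linux-only (it reads /proc/<pid>/status), the function names are hypothetical, and VmRSS also counts shared buffer pages the backend has touched, so growth there is suggestive rather than proof of relcache / syscache bloat.

```python
# Query that could be run separately to count ordinary tables
# (relkind 'r' in the pg_class catalog).
TABLE_COUNT_SQL = "SELECT count(*) FROM pg_class WHERE relkind = 'r'"


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line looks like: "VmRSS:     123456 kB"
            return int(line.split()[1])
    return None


def read_backend_rss_kb(backend_pid):
    """Read the resident set size of a backend process (Linux only)."""
    with open("/proc/%d/status" % backend_pid) as status_file:
        return parse_vm_rss_kb(status_file.read())


if __name__ == "__main__":
    import os

    # Demonstrate on the current process; a real check would use the
    # backend PID and sample it periodically while the session is in use.
    print(read_backend_rss_kb(os.getpid()))
```

Sampling this every few minutes for a long-lived backend, and noting whether the slowdown coincides with steady growth, would at least say whether the hypothesis is worth pursuing.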

cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)