backend suddenly becomes slow, then remains slow

Jeff Janes <jeff.janes@xxxxxxxxx> · Wed, 26 Dec 2012 23:03:33 -0500

On Fri, Dec 14, 2012 at 10:40 AM, Andrew Dunstan <andrew.dunstan@xxxxxxxxxxxxx> wrote:

> One of my clients has an odd problem. Every so often a backend will suddenly
> become very slow. The odd thing is that once this has happened it remains
> slowed down, for all subsequent queries. Zone reclaim is off. There is no IO
> or CPU spike, no checkpoint issues or stats timeouts, no other symptom that
> we can see.

By "no spike", do you mean that the system as a whole is not using an unusual amount of IO or CPU, or that this specific slow back-end is not using an unusual amount?

Could you strace it and see what it is doing?
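
As a rough sketch of what attaching strace to one backend could look like, the helper below only builds the command line; it does not run anything. The function name and the example PID are made up for illustration, and the PID of the slow backend would have to come from somewhere like pg_stat_activity or ps. The flags are standard strace options: -p attaches to a running process, -tt and -T add timestamps and per-call durations, and -c prints a per-syscall summary instead.

```python
import shlex


def strace_command(pid, summary=False):
    """Build an strace argv for attaching to an already-running backend.

    Hypothetical helper for illustration; it only assembles the command.
    """
    pid = int(pid)
    if pid <= 0:
        raise ValueError("pid must be a positive integer")

    argv = ["strace"]
    if summary:
        # -c: collect counts and time per syscall, printed when strace detaches
        argv.append("-c")
    else:
        # -tt: wall-clock timestamps with microseconds
        # -T:  time spent inside each syscall
        argv += ["-tt", "-T"]
    # -p: attach to the existing process with this PID
    argv += ["-p", str(pid)]
    return argv


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a value from the original report.
    print(shlex.join(strace_command(12345)))
    print(shlex.join(strace_command(12345, summary=True)))
```

Comparing the summary output of a slow backend against a freshly started one might show whether the slow one is spending its time in system calls at all, or burning CPU in user space.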

> The problem was a lot worse than it is now, but two steps have
> alleviated it mostly, but not completely: much less aggressive autovacuuming
> and reducing the maximum lifetime of backends in the connection pooler to 30
> minutes.

Do you have a huge number of tables? Maybe over the course of a long-lived connection, it touches enough tables to bloat the relcache / syscache. I don't know how the autovac would be involved in that, though.
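
One way to get a feel for "a huge number of tables" is to count the tables and indexes in pg_class for the database in question. The following is only a sketch: the function name is invented here, and it assumes some DB-API style cursor (for example from psycopg2) already connected to that database. The relkind codes 'r' (ordinary table) and 'i' (index) are the ones pg_class uses.

```python
# Count ordinary tables ('r') and indexes ('i') in the current database.
RELATION_COUNT_SQL = """
SELECT relkind, count(*)
FROM pg_class
WHERE relkind IN ('r', 'i')
GROUP BY relkind
"""


def relation_counts(cursor):
    """Return {relkind: count} using any DB-API cursor.

    Hypothetical helper; the cursor is assumed to be connected to the
    database the slow backends are using.
    """
    cursor.execute(RELATION_COUNT_SQL)
    return {kind: count for kind, count in cursor.fetchall()}
```

If the counts run into the tens of thousands or more, a backend that gradually touches most of them over a long connection lifetime would be a plausible candidate for cache bloat, though that is a guess rather than something established in this thread.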

Cheers,

Jeff