Hi,
To clarify, the slow planning time was on all nodes until I restarted the
slave nodes; the slaves are now performing better. The master, however, is
still performing slowly.
I can increase the number of auto-vacuum workers, but this cluster only has
about 25 tables in it (though, as I mentioned before, some have quite a lot
of rows). So to my mind it seemed better to have one auto-vacuum process
get a whole table done as fast as possible, and to limit the impact by only
having one. You are also correct that the tool we have deliberately only
runs VACUUM ANALYZE on user tables. The assumption was that this would be
enough to keep the user tables tidy (and lets us target the vacuuming of
these tables at lower-usage times), while auto-vacuum in the database could
take care of the rest. Is this not correct?
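For what it's worth, a quick way to check whether auto-vacuum has actually
been reaching pg_statistic (this uses the standard pg_stat_sys_tables view;
the manual vacuum needs superuser access):

    -- When did (auto)vacuum last touch pg_statistic, and how many
    -- dead tuples has it accumulated?
    SELECT relname, n_dead_tup, last_vacuum, last_autovacuum, last_autoanalyze
    FROM pg_stat_sys_tables
    WHERE relname = 'pg_statistic';

    -- If it looks neglected, a superuser can clean it up directly:
    VACUUM VERBOSE pg_statistic;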
Thanks
Scott
On Wed, 23 Dec 2015 10:24:24 -0500, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Scott Neville <scott.neville@xxxxxxxxxxxxxxxxxxxxxxx> writes:
We have a database that for some reason has started to be really
slow at planning all queries. The database has been running version 9.4.2
since July 28th (it was freshly installed then, compiled from source). The
response time is fairly sporadic, but the quickest plan
time I have seen (on any query) using explain analyze is 39ms with an
execution time of 1ms, however we have slow query logging on and we
are getting queries taking over 6000 ms in the planning stage with
then only a few ms to execute. There is nothing complex about the
queries so even something like this:
select max(datetime) from audit;
(where datetime is an indexed field) takes 200ms to plan and 0.5ms to
execute.
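For reference, 9.4's EXPLAIN ANALYZE reports both phases directly; the
numbers below are illustrative, not a real capture:

    EXPLAIN ANALYZE select max(datetime) from audit;
    -- ...plan nodes elided...
    --  Planning time: 200.123 ms
    --  Execution time: 0.512 ms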
The databases are involved in a replication chain so I have
M1 -> S1 -> S2
I have restarted S2 and S1 and this appears to have made the problem
go away (though for how long....). S1 has a replication slot listed
on M1.
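The slot can be confirmed from M1 with the standard catalog view (available
in 9.4):

    -- Is the slot for S1 present and in use?
    SELECT slot_name, slot_type, active, restart_lsn
    FROM pg_replication_slots;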
Please clarify: the slowness occurs on the slaves but not the master?
I am suspicious that the problem has to do with bloat in pg_statistic,
which I will bet that your homegrown vacuuming protocol isn't covering
adequately. I concur with Kevin's nearby advice that you'd be better off
to forget that and use 10 or so autovacuum workers; you can use
autovacuum_vacuum_cost_limit to throttle their I/O impact, and still be a
lot better off than with just 1 worker.
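A minimal sketch of that, using 9.4's ALTER SYSTEM; the values are
illustrative, not recommendations:

    ALTER SYSTEM SET autovacuum_max_workers = 10;         -- takes effect after a restart
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 200;  -- throttles autovacuum I/O
    SELECT pg_reload_conf();                              -- reload picks up the cost limit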
There is probably something else going on that's replication-specific, but
I'm not sufficiently up on that aspect of things to theorize.
regards, tom lane
---
Scott Neville
Software Developer, Bluestar Software
Telephone: +44 (0)1256 882695
Web site: www.bluestar-software.co.uk
Facebook: www.facebook.com/bluestarsoftware
Email: scott.neville@xxxxxxxxxxxxxxxxxxxxxxx