Hi,
To clarify, the slow planning time was on all nodes until I restarted the
slave nodes; the slaves are now performing better. The master, however, is
still performing slowly.
I can increase the number of auto-vacuum workers, but this cluster only has
about 25 tables in it (though, as I mentioned before, some have quite a lot
of rows). So to my mind it seemed better to have one auto-vacuum process
get a whole table done as fast as possible, and to limit the impact by only
having one. You are also correct that the tool we have deliberately only
runs VACUUM ANALYZE on user tables. The assumption was that this would be
enough to keep the user tables tidy (and lets us target the vacuuming of
these tables at lower-usage times), while auto-vacuum in the database could
take care of the rest. Is this not correct?
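For what it's worth, a quick way to check whether auto-vacuum has actually
been reaching pg_statistic (this uses the standard pg_stat_sys_tables view;
the manual vacuum needs superuser access):

    -- When did (auto)vacuum last touch pg_statistic, and how many
    -- dead tuples has it accumulated?
    SELECT relname, n_dead_tup, last_vacuum, last_autovacuum, last_autoanalyze
    FROM pg_stat_sys_tables
    WHERE relname = 'pg_statistic';

    -- If it looks neglected, a superuser can clean it up directly:
    VACUUM VERBOSE pg_statistic;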
Thanks
Scott
On Wed, 23 Dec 2015 10:24:24 -0500, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Scott Neville <scott.neville@xxxxxxxxxxxxxxxxxxxxxxx> writes:
We have a database that for some reason has started to be really
slow at planning all queries. The database has been running version 9.4.2
since July 28th (it was freshly installed then, compiled from source). The
response time is fairly sporadic, but the quickest plan
time I have seen (on any query) using explain analyze is 39ms with an
execution time of 1ms, however we have slow query logging on and we
are getting queries taking over 6000 ms in the planning stage with
then only a few ms to execute. There is nothing complex about the
queries so even something like this:
select max(datetime) from audit;
(where datetime is an indexed field) takes 200ms to plan and 0.5ms to
execute.
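For reference, 9.4's EXPLAIN ANALYZE reports both phases directly; the
numbers below are illustrative, not a real capture:

    EXPLAIN ANALYZE select max(datetime) from audit;
    -- ...plan nodes elided...
    --  Planning time: 200.123 ms
    --  Execution time: 0.512 ms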
The databases are involved in a replication chain so I have
M1 -> S1 -> S2
I have restarted S2 and S1 and this appears to have made the problem
go away (though for how long....). S1 has a replication slot listed
on M1.
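The slot can be confirmed from M1 with the standard catalog view (available
in 9.4):

    -- Is the slot for S1 present and in use?
    SELECT slot_name, slot_type, active, restart_lsn
    FROM pg_replication_slots;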
Please clarify: the slowness occurs on the slaves but not the master?
I am suspicious that the problem has to do with bloat in pg_statistic,
which I will bet that your homegrown vacuuming protocol isn't covering
adequately. I concur with Kevin's nearby advice that you'd be better off
to forget that and use 10 or so autovacuum workers; you can use
autovacuum_vacuum_cost_limit to throttle their I/O impact, and still be a
lot better off than with just 1 worker.
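A minimal sketch of that, using 9.4's ALTER SYSTEM; the values are
illustrative, not recommendations:

    ALTER SYSTEM SET autovacuum_max_workers = 10;         -- takes effect after a restart
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 200;  -- throttles autovacuum I/O
    SELECT pg_reload_conf();                              -- reload picks up the cost limit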
There is probably something else going on that's replication-specific, but
I'm not sufficiently up on that aspect of things to theorize.
regards, tom lane
---
Scott Neville
Software Developer, Bluestar Software
Telephone: +44 (0)1256 882695
Web site: www.bluestar-software.co.uk
Facebook: www.facebook.com/bluestarsoftware
Email: scott.neville@xxxxxxxxxxxxxxxxxxxxxxx