Search Postgresql Archives

Re: Auto-analyse on insert operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





Le 04/11/2015 16:57, Adrian Klaver a écrit :
On 11/04/2015 07:43 AM, Bertrand Roos wrote:


Le 04/11/2015 14:55, Bill Moran a écrit :
On Wed, 4 Nov 2015 14:32:37 +0100
Bertrand Roos <bertrand.roos@xxxxxxxx> wrote:
I try to configure auto-analyse task with postgresql 9.4.
I have the following configuration (default configuration):
track_counts = on
autovacuum = on
log_autovacuum_min_duration = -1
autovacuum_max_workers = 3
autovacuum_naptime = 300s
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.2
autovacuum_freeze_max_age = 200000000
autovacuum_multixact_freeze_max_age = 400000000
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1

With this configuration, I can observe that some tables are
auto-analysed, but some others are not. Even if there are millions of
insert operations on an empty table (all tables are in cluster mode).
In fact it seems that tables with update operations are the only ones
that are auto-analysed.
I'm quite suprised because the documentation says that daemon check the
count of insert, update and delete operations.
What could it be the reason ? Why tables which have only update
operation, aren't analysed ?
Are update operations really taken into account ?
Given that autoanalyze is pretty critical to the way the system
functions,
it's unlikely that it just doesn't work (someone else would have
noticed).

A more likely scenario is that you've found some extremely obscure edge
case. If that's the case, you're going to have to give very specific
details as to how you're testing it before anyone is liable to be able
to help you.

I get the impression that you're somewhat new to Postgres, in which case
it's very likely that the problem is that you're not testing the
situation
correctly. In that case, we're going to need specific details on how
you're
observing that tables are or are not being analysed.

As a wild-guess theory: the process that does the analyze only wakes up
to check tables every 5 minutes (based on the config you show) ... so are you doing the inserts then checking the table without leaving enough time
in between for the system to wake up and notice the change?

Thanks for your answer Bill.
Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing
something wrong. But I did my test on a more than 1 day duration, so
it's not an issue of autovacuum_naptime (I insert 760 lignes each 30
seconds during 36 hours).
I can't give all the details of this test because it is to complicated
with triggers and partman (and your objective is not to solve
configuration issues of others).

In fact, I was telling the question because I have read on some forums
that the auto vacuum deamon only count dead tuple so only update and
delete operations can cause the scheduling of auto-analyse.
So if it's the case it perfectly explain why my test case doesn't work.
But in the other hand the documentation says that delete, update and
insert operations are counted.
Is it an know issue that insert operations are not counted for the
trigger of auto-analyse ?

No, see below:

http://www.postgresql.org/docs/9.4/interactive/routine-vacuuming.html#AUTOVACUUM

23.1.3. Updating Planner Statistics

"The autovacuum daemon, if enabled, will automatically issue ANALYZE commands whenever the content of a table has changed sufficiently. However, administrators might prefer to rely on manually-scheduled ANALYZE operations, particularly if it is known that update activity on a table will not affect the statistics of "interesting" columns. The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated; it has no knowledge of whether that will lead to meaningful statistical changes."


Is the partman you refer to this?:

https://github.com/keithf4/pg_partman

Can you give an outline view of what you are doing and how you are determining the status of analyze?


If it's not, I can try to reproduce this weird behaviour with a simpler
test and give you all the details of the test.

Bertrand






I have done multiple tests trying to reproduce my issue and finally find a partial explanation.

My non-working use case is :
With a database with configuration by default (autovacuum=on, autovacuum_max_workers=3 ...) A table ('table_name') partioned with partman with a child table per month (@Adrian Klaver, yes it's https://github.com/keithf4/pg_partman)
A remote deamon that calls an analyze every day on the table 'table_name'
A remote application that calls multiple stored procedures doing an insert operation on 'table_name' (785 inserts each 30s), with no delete, no update operation This stored procedures are sometimes blocked by a short lock (they are doing some update operations on a cache table of 'table_name')

It happened that :
The current child table has no statistics :
- SELECT * FROM pg_stat WHERE tablename = '$table_name_month'; => empty row
- SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name_month'; => statistics are good (coherent with values I should have) but I saw no value for last_autoanalyze or last_analyze (even if the test was running for a few days and the table start with no data and grow to 6 million rows)

The behaviour of the auto-analyze I have observed :
Insert operations on a table trigger auto-analyze (with track_counts=on).
Stored procedure on a table trigger auto-analyze (even if track_functions=off)
Insert operations on a partitioned table trigger auto-analyze.
Insert operations on a daily analyze table may not trigger auto-analyze, because an analyze reset the count of operation for auto-analyze. An analyze on a partitioned table update statistics of the parent table but not for children tables. An analyze on a partitioned table do not reset the count of operations for auto-analyze of children tables.

I may not be able yet to explain the reason why the auto-analyze is rarely not triggered.
But at least I found a way to fix pg_stat of the current table.

@Bill Moran, Are you sure that a dump of the table could help you ? The values of the table are correct, the main issue is that statistics are becoming invalid and so postgres is choosing dummy request paths.

Bertrand



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux