On 9/23/16 12:59 PM, phb07 wrote:
Le 21/09/2016 à 23:42, Jim Nasby a écrit :
On 9/12/16 1:05 PM, phb07 wrote:
The drawback is the overhead of this added ANALYZE statement. With a
heavy processing like in this test case, it is worth to be done. But for
common cases, it's a little bit expensive.
You could always look at the number of rows affected by a command and
make a decision on whether to ANALYZE based on that, possibly by
looking at pg_stat_all_tables.n_mod_since_analyze.
I have solved the issue by adding an ANALYZE between both statements. To
avoid the associated overhead for cases when it is not worth to be done,
the ANALYZE is only performed when more than 1000 rows have just been
deleted by the first statement (as the logic is embeded into a plpgsql
function, the GET DIAGNOSTICS statement provides the information). This
threshold is approximately the point where the potential loss due to bad
estimates equals the ANALYZE cost.
But the idea of using the n_mod_since_analyze data to also take into
account other recent updates not yet reflected into the statistics is
very interesting.
Another interesting possibility would be to look at
pg_catalog.pg_stat_xact_all_tables; if you add n_tup_ins, _upd, and _del
that will tell you how much n_mod_since_analyze will be increased when
your transaction commits, so you could guage exactly how much the
current transaction has changed things.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance