On Fri, 4 Feb 2011, Vitalii Tymchyshyn wrote:
04.02.11 16:33, Kenneth Marshall wrote:
In addition, the streaming ANALYZE can provide better statistics at
any time during the load and it will be complete immediately. As far
as passing the entire table through the ANALYZE process, a simple
counter can be used to only send the required samples based on the
statistics target. Where this would seem to help the most is with
temporary tables, which autovacuum currently does not process; it
would streamline their use in more complicated queries that need
an ANALYZE to perform well.
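
The "simple counter" idea can be sketched with reservoir sampling, which keeps a fixed-size uniform sample of a stream while counting rows as they go by. This is only an illustration, not how ANALYZE is implemented: the class name StreamingSample and the helper sample_size are hypothetical, and the 300-rows-per-target multiplier is an assumption about how the sample size would be derived from the statistics target.

```python
import random


class StreamingSample:
    """Reservoir sample (Algorithm R) of at most `capacity` rows from a stream."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.seen = 0      # the "simple counter": rows observed so far
        self.rows = []     # current sample, valid at any point during the load
        self._rng = random.Random(seed)

    def add(self, row):
        self.seen += 1
        if len(self.rows) < self.capacity:
            # Still filling the reservoir: keep every row.
            self.rows.append(row)
            return
        # Keep the new row with probability capacity / seen,
        # replacing a uniformly chosen existing entry.
        j = self._rng.randrange(self.seen)
        if j < self.capacity:
            self.rows[j] = row


def sample_size(statistics_target, rows_per_target=300):
    # Hypothetical helper; the multiplier is an assumption, not a quoted setting.
    return statistics_target * rows_per_target
```

Each loaded row costs one counter increment and one random number, and the sample can be read at any point during the load, which is the "complete immediately" property described above.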
Actually, for me the main "con" of streaming analyze is that it adds a
significant CPU burden to a load process that is already not too fast,
especially if it's done automatically for every load operation (and I can't
see how it could be enabled only past some threshold).
two thoughts:
1. if it's a large enough load, isn't it I/O bound?
2. this could be done in a separate process/thread from the load itself;
that way the overhead on the load is just copying the data in memory to
the other process.
with a multi-threaded load, this would eat up some CPU that could be used
for the load, but cores per chip are still climbing rapidly, so I expect that
it's still pretty easy to end up with enough CPU to handle the extra load.
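
A rough sketch of the separate-thread idea, assuming a bounded in-memory queue between the loader and the analyzer. The names load_with_analyze, analyze_worker and the store callback are hypothetical, and the running count/min/max stands in for whatever statistics a real analyzer would gather. In CPython the two threads share the interpreter lock, so a real implementation wanting a second core would use a separate process; the hand-off pattern is the same.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the load


def analyze_worker(rows, stats):
    """Consume rows from the queue and keep running statistics."""
    while True:
        row = rows.get()
        if row is _DONE:
            break
        stats["count"] += 1
        stats["min"] = row if stats["min"] is None else min(stats["min"], row)
        stats["max"] = row if stats["max"] is None else max(stats["max"], row)


def load_with_analyze(source, store, queue_size=1024):
    """Load every row via `store` while a second thread analyzes the same rows."""
    rows = queue.Queue(maxsize=queue_size)
    stats = {"count": 0, "min": None, "max": None}
    worker = threading.Thread(target=analyze_worker, args=(rows, stats))
    worker.start()
    for row in source:
        store(row)     # the actual load
        rows.put(row)  # extra cost on the load path: hand the row over
    rows.put(_DONE)
    worker.join()      # statistics are complete once the worker has drained the queue
    return stats
```

Because the queue is bounded, the loader blocks in put() if the analyzer falls behind, so the analysis can still slow the load when there is not enough spare CPU.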
David Lang
And you can't start after some threshold of data has passed by, since you may
lose significant information (like minimum values).
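
A small made-up illustration of that point: if the data arrives in ascending order and the analyzer is only switched on after the first 100 rows, the true minimum is never seen. The function observed_min and the data are hypothetical.

```python
def observed_min(values, start_after=0):
    """Minimum seen by an analyzer that ignores the first `start_after` rows."""
    seen = values[start_after:]
    return min(seen) if seen else None


loaded = list(range(1, 1001))                 # made-up load, arriving in ascending order
full = observed_min(loaded)                   # analyzer running from the first row
late = observed_min(loaded, start_after=100)  # analyzer enabled after 100 rows
```

Here full is 1 while late is 101, so the late-started analyzer reports a minimum that is off by the entire skipped prefix.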
Best regards, Vitalii Tymchyshyn
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)