Re: analyze foreign tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote:
> In PostgreSQL foreign tables are not automatically analyzed and analyze must be
> specifically called on each table.  In the case of sharing tables between
> PostgreSQL clusters, there is the use_remote_estimate foreign server option.
> 
> In some of our multi terabyte databases, manually running analyze on all of
> the foreign tables can take more than a day.  This is per database containing
> the foreign schema.  Since we have certain large schema that we have centrally
> located and share to all of our other database clusters, this really adds up.
> 
> use_remote_estimate isn't really a solution as it adds way too much overhead
> and processing time to every query run.
> 
> Since these tables are being continuously analyzed in the database that hosts
> the data, is there some way that they statistics could be easily passed through
> the foreign server mechanism to the remote database that's calling the query?
> 
> Unless I'm missing something we can either:
> 1. manually run analyze on each foreign table in each database that points to
>    the host table
> 2. set use_remote_estimate = true which will cause PostgreSQL to re-obtain
>    statistics on a per query basis.
> 
> What I am hoping for is either:
> 1. pass through the results of analyze from the source database to the one
>    where the foreign query is being run
> 2. add the ability to automatically run analyze on foreign tables just as they
>    are currently run on local tables.
> 
> Of the two, #1 would seem to be the easiest and least wasteful of resources.

Unfortunately, both your wishes don't look feasible:

- Transferring table statistics would mean that PostgreSQL understands statistics
  from other server versions.  This is complicated, and we have decided not to
  do this for pg_upgrade, so I don't think we'll try to do it here.

- Autoanalyzing foreign tables would mean that we have some idea how much data
  has changed on the remote server.  How should we do that?

What I can imagine is that instead of reading the complete remote table during
ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part.  That could be a
workable enhancement.

Yours,
Laurenz Albe






[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux