On Tue, 2023-08-01 at 09:47 -0400, richard coleman wrote: > In PostgreSQL foreign tables are not automatically analyzed and analyze must be > specifically called on each table. In the case of sharing tables between > PostgreSQL clusters, there is the use_remote_estimate foreign server option. > > In some of our multi terabyte databases, manually running analyze on all of > the foreign tables can take more than a day. This is per database containing > the foreign schema. Since we have certain large schema that we have centrally > located and share to all of our other database clusters, this really adds up. > > use_remote_estimate isn't really a solution as it adds way too much overhead > and processing time to every query run. > > Since these tables are being continuously analyzed in the database that hosts > the data, is there some way that they statistics could be easily passed through > the foreign server mechanism to the remote database that's calling the query? > > Unless I'm missing something we can either: > 1. manually run analyze on each foreign table in each database that points to > the host table > 2. set use_remote_estimate = true which will cause PostgreSQL to re-obtain > statistics on a per query basis. > > What I am hoping for is either: > 1. pass through the results of analyze from the source database to the one > where the foreign query is being run > 2. add the ability to automatically run analyze on foreign tables just as they > are currently run on local tables. > > Of the two, #1 would seem to be the easiest and least wasteful of resources. Unfortunately, both your wishes don't look feasible: - Transferring table statistics would mean that PostgreSQL understands statistics from other server versions. This is complicated, and we have decided not to do this for pg_upgrade, so I don't think we'll try to do it here. - Autoanalyzing foreign tables would mean that we have some idea how much data has changed on the remote server. How should we do that? What I can imagine is that instead of reading the complete remote table during ANALYZE, PostgreSQL applies TABLESAMPLE to fetch only part. That could be a workable enhancement. Yours, Laurenz Albe