Re: Duplicate Extended Statistics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ilia,

In PostgreSQL, it is possible to create duplicate extended statistics because the system does not enforce uniqueness on statistics definitions. However, this is generally not recommended, as it leads to longer ANALYZE times, increased storage usage, potential planner performance impact, and unnecessary complexity. In practice, duplicates are rare because users and tools usually avoid redundancy, as there is no added benefit to having multiple identical sets of statistics on the same columns.

Regards,
Asad Ali

On Tue, Sep 3, 2024 at 6:10 PM Ilia Evdokimov <ilya.evdokimov@xxxxxxxxxxxxxx> wrote:

Hello everyone,

I have a question regarding extended statistics in PostgreSQL. Why is it possible to create duplicate extended statistics? To make it clearer, here’s an example:

CREATE TABLE t(a int, b int);
INSERT INTO t(a, b) VALUES (...);
CREATE STATISTICS ON a, b FROM t;
ANALYZE t;
....
CREATE STATISTICS ON a, b FROM t;
ANALYZE t;

After executing these queries, the following issues might arise:

  1. ANALYZE will take longer to run because, for example, MCV extended statistics would need to be gathered twice.
  2. Duplicate information will be stored.
  3. The planner might take longer to find the relevant statistics since it has to search through them in a loop.

Or do duplicate extended statistics practically never occur in practice?

Thanks in advance for your response.

-- 
Regards,
Ilia Evdokimov,
Tantor Labs LCC.

[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux