Duplicate Extended Statistics

Ilia Evdokimov <ilya.evdokimov@xxxxxxxxxxxxxx> · Tue, 3 Sep 2024 16:09:50 +0300



    Hello everyone,
    I have a question regarding extended statistics in PostgreSQL.
      Why is it possible to create duplicate extended statistics? To
      make it clearer, here’s an example:
    CREATE TABLE t(a int,
      b int);

      INSERT INTO t(a, b) VALUES (...);

      CREATE STATISTICS ON a, b FROM
      t;

      ANALYZE t;

      ....
      

      CREATE STATISTICS ON a, b FROM
      t;

      ANALYZE t;
    After executing these queries, the following issues might arise:
    
      ANALYZE will take longer to run because, for example, MCV
        extended statistics would need to be gathered twice.
      Duplicate information will be stored.
      The planner might take longer to find the relevant statistics
        since it has to search through them in a loop.
    
    Or do duplicate extended statistics practically never occur in
      practice?
    Thanks in advance for your response.
    
    
    -- 
Regards,
Ilia Evdokimov,
Tantor Labs LCC.