Re:Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

gzh <gzhcoder@xxxxxxx> · Tue, 11 Oct 2022 12:05:16 +0800 (CST)

Hi, Pavel
Thank you for your reply.

> the LIMIT clause is in this case totally useless and messy, and maybe can
> negative impacts optimizer
Yes. After removing the LIMIT clause, the performance is improved. 
The execution plan shows that the index worked.
We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause.

At 2022-10-11 11:32:48, "Pavel Stehule" <pavel.stehule@xxxxxxxxx> wrote:

út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder@xxxxxxx> napsal:
Hi, Tom
Thank you for your reply.

> When you're asking for help, please don't give us vague statements
> like "doesn't seem to work". 
I understand.

> Did the plan (including rowcount
> estimates) change at all?  To what?  How far off is that rowcount
> estimate, anyway --- that is, how many rows actually have cseid = 94?
Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;
 enable_seqscan
----------------
 on
(1 行)

new=# select count(*) from analyze_word_reports;
  count   
----------
 21331980
(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);
  count
---------
 1287156
(1 行)

new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1;
                                                                                                 QUERY PLAN                       

the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer

Regards

Pavel

----------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------
 Limit  (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1)
   ->  Finalize Aggregate  (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1)
         ->  Gather  (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Partial Aggregate  (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3)
                     ->  Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports  (cost=0.56..6290
9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)
                           Index Cond: (cseid = 94)
                           Heap Fetches: 1287156  Planning Time: 0.122 ms  Execution Time: 133.069 ms
(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;
                                                             QUERY PLAN                                                           

----------------------------------------------------------------------------------------------------------------------------------
---
 Limit  (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)
   ->  Seq Scan on analyze_word_reports  (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=
1)
         Filter: (cseid = 94)
         Rows Removed by Filter: 18320180  Planning Time: 0.086 ms  Execution Time: 2156.985 ms
(6 行)

> If the estimate is far off, then increasing the table's statistics
> target might help.
Thank you for your advice. 
Please tell me how to set the table's statistics up to improve performance.

new=#  select oid from pg_class where relname = 'analyze_word_reports';
  oid  
-------
16429
(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid';
attrelid | attname | attstattarget 
----------+---------+---------------
    16429 | cseid   |            -1
(1 行)

> Another thing that would be worth checking is whether
> "set enable_seqscan = off" prods it to choose the plan you want.
> If not, then there's something else going on besides poor estimates.
"set enable_seqscan = off" works, and the performance is greatly improved, which is almost the same as PostgreSQL 8.4.
The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" <tgl@xxxxxxxxxxxxx> wrote:
>gzh  <gzhcoder@xxxxxxx> writes:
>> I've run analyze(not vacuum analyze), but it doesn't seem to work.
>
>When you're asking for help, please don't give us vague statements
>like "doesn't seem to work".  Did the plan (including rowcount
>estimates) change at all?  To what?  How far off is that rowcount
>estimate, anyway --- that is, how many rows actually have cseid = 94?
>
>If the estimate is far off, then increasing the table's statistics
>target might help.
>
>Another thing that would be worth checking is whether
>"set enable_seqscan = off" prods it to choose the plan you want.
>If not, then there's something else going on besides poor estimates.
>
>			regards, tom lane