Search Postgresql Archives

Re: Problem with a Query

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Aggressive autoanalyze and autovacuum settings solve most query problems.  These are my settings:
default_statistics_target = 5000
autovacuum_vacuum_scale_factor = 0.015
autovacuum_vacuum_threshold = 250
autovacuum_analyze_scale_factor = 0.015
autovacuum_analyze_threshold = 250

Such a high default_statistics_target value is controversial, but works for our databases, and resetting it to 100 doesn't noticably speed up slow parse/optimize on queries that take a long time to parse/optimize any more than the 5000 value.

On Mon, Aug 26, 2024 at 6:30 AM Siraj G <tosiraj.g@xxxxxxxxx> wrote:
Thanks Tom. Collecting full stats on the tables involved corrected the execution.

On Tue, Aug 13, 2024 at 9:57 AM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Siraj G <tosiraj.g@xxxxxxxxx> writes:
> We migrated a PgSQL database from Cloud SQL to compute engine and since
> then there is a SQL we observed taking a long time. After some study, I
> found that the SQL is using NESTED LOOP where the cost is too high.

The core of your problem seems to be here:

>                      ->  Index Scan using marketing_a_cancel__55ffff_idx on
> marketing_app_leadhistory w0  (cost=0.57..4274.30 rows=1 width=8) (actual
> time=46.678..51.232 rows=44 loops=1)
>                            Index Cond: ((cancel_event_id IS NOT NULL) AND
> (cancel_event_type = 1))
>                            Filter: ((status_id = 93) AND
> ((followup_date)::date >= '2024-08-01'::date) AND ((followup_date)::date <=
> '2024-08-07'::date))
>                            Rows Removed by Filter: 22268
>                            Buffers: shared hit=9170 read=19

If the planner had estimated 40-some rows out of this step, rather
than one, it would certainly not have chosen to use nestloop joins
atop this.  So the big problem to focus on is making that estimate
better.

A secondary problem is that the choice of index seems poor: the
index itself is selecting 44+22268 = 22312 rows and then the filter
condition is throwing away 99.8% of those rows.  Probably, using
an index on (status_id, followup_date) would have worked better.

I suspect that both of these things are tied to the non-normalization
of your "cancel" condition.  The planner probably believes that
"cancel_event_id IS NOT NULL" is statistically independent of
"cancel_event_type = 1"; but I'll bet it isn't, and thus the index
condition selects many more rows than the planner guessed.  You might
be able to improve that estimate by creating extended stats on both of
those columns, but really a better idea would be to take a step back
and figure out if those two columns can't be merged into one.

                        regards, tom lane


--
Death to America, and butter sauce.
Iraq lobster!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux