Re: inputs into query planner costing

Tom Lane <tgl@xxxxxxxxxxxxx> · Tue, 31 Mar 2015 14:16:09 -0400

Mike Roest <mike.roest@xxxxxxxxxxxx> writes:
>    I'm having an issue with query performance between 2 different pgsql
> environments.

> Ther first is our current production postgres server with is running 9.3.5
> on Centos 5 x64.  The second system is Amazon's RDS postgres as a service.
> On our local DB server we have a query that executes in a reasonable amount
> of time (600 msec).  On RDS the query will run for more then 10 minutes on
> a similarly CPU specced systems.  I've been working through with Amazon
> support and I'm looking for more suggestions on where to look (both for me
> and to direct Amazon).  The RDS system does use a network filesystem while
> our production server is a local RAID10 array, I can see that effecting the
> actual performance of the query but not the query planner costing (unless
> there's an input to query planner costing that I can't find)

> The Query plan costs generated by the 2 systems are vastly different, while
> the plans themselves are basically identical other then the materialization
> that RDS is doing (if I disable the materialization then they are almost
> the same other then a seq scan/heap scan on one small <2000 row table).
> All the tables in the query have been analyzed on each server without any
> impact

Last I checked, there was not any magic pixie dust in the planner ;-).
Your results have to be explained by one or more of these things:

1. Not same version of Postgres between the two systems.

2. Not same planner parameter settings.

3. Different physical table sizes.

4. Different ANALYZE statistics.

As for #1, I have no idea whether Amazon RDS runs a purely stock Postgres
release or has some custom modifications of their own, but it'd be worth
asking about that.

As for #2, you say you checked that, but I'm dubious.  In particular this
discrepancy:

Index Scan using uix2pdas_userpolicy on policydataaccessscope policydataaccessscope31 (cost=0.28..8.30 rows=1 width=16)

Index Scan using uix2pdas_userpolicy on policydataaccessscope policydataaccessscope31 (cost=0.28..4.30 rows=1 width=16)

is hard to explain unless the second system is using a smaller
random_page_cost than the first.  Maybe somebody used ALTER ROLE SET
or ALTER DATABASE SET to adjust parameters in a way that only affects
some roles/databases?

I suspect that the large differences in some of the seqscan costs might be
explainable by #3, ie those tables are bloated with lots of empty space on
one system but not the other.  Comparing pg_relation_size() would be the
way to find out.

I mention #4 for completeness.  ANALYZE uses random sampling, so it's
expectable that the data distribution stats would be a bit different on
the two systems, but large differences that persist across multiple
ANALYZE attempts are unlikely.  (Although ... you do have the
same default_statistics_target on both systems, no?  Table-specific
statistics targets could be a gotcha as well.)

			regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general