The behaviour is different in Postgres 8.1.9 (much faster). The table
there has 9 million rows instead of 25 million, but the query comes
back very fast (8 seconds). I wonder if this is specific to 8.4.0.

On Fri, Jun 4, 2010 at 11:01 AM, Anj Adu <fotographs@xxxxxxxxx> wrote:
> Is the difference in expected and actual rows as seen by the planner
> a big factor? Even after an analyze, the results are similar (there
> is a big difference between expected and actual).
> The partition has 25 million rows.
>
> On Fri, Jun 4, 2010 at 10:41 AM, Anj Adu <fotographs@xxxxxxxxx> wrote:
>> 2010/6/4 <tv@xxxxxxxx>:
>>>> I am reposting, as my original query was mangled.
>>>>
>>>> The link to the explain plan is here, as it does not paste well
>>>> into the email body:
>>>>
>>>> http://explain.depesz.com/s/kHa
>>>>
>>>> The machine is a 2-CPU quad-core 5430 with 32G RAM and a single
>>>> 6x450G 15K RAID-10 array.
>>>>
>>>> work_mem = 1G
>>>> default_statistics_target = 1000
>>>> random_page_cost = 1
>>>
>>> Are you sure it's wise to set work_mem to 1G? Do you really need
>>> it? Don't forget this is not a 'total' or 'per query' limit - each
>>> query may allocate multiple work areas (and so occupy multiple GB).
>>> But I guess this does not cause the original problem.
>>>
>>> The last line, 'random_page_cost=1', basically says that reading
>>> data at random is just as cheap as reading it sequentially, which
>>> may result in poor performance due to bad plans. Why have you set
>>> this value?
>>>
>>> Sure, there are rare cases where 'random_page_cost=1' is OK.
>>
>> The default for 8.4 is 2.
>> I tried with 2 and 1, but the results are not very different. I
>> understand that for fast disks (which we have, with a decent RAID-10
>> setup) the random_page_cost can be lowered as needed, but I guess it
>> did not make a difference here.
>>
>>>> I am curious why the hash join takes so long. The main table
>>>> dev4_act_dy_fact_2010_05_t has 25 million rows.
>>>> The table is partitioned into 3 parts per month. The remaining
>>>> tables are very small (< 1000 rows).
>>>
>>> Well, the real cause that makes your query slow is the 'index scan'
>>> part:
>>>
>>> Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on
>>> dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276
>>> width=60) (actual time=164533.725..164533.725 rows=0 loops=1)
>>>
>>> The first thing to note here is the difference between the expected
>>> and actual number of rows - the planner expects 204276 rows but
>>> gets 0. How large is this partition?
>>
>> The partition has 25 million rows, with indexes on theDate and
>> node_id. I altered the random_page_cost to 4 (1 more than the
>> default) - still slow. These tables are analyzed every day, and I
>> have an index on each field used in the where criteria.
>>
>>> Try to analyze it, set the random_page_cost to something reasonable
>>> (e.g. 4), and try to run the query again.
>>>
>>> Tomas

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
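[The advice in the thread can be tried in a single psql session before
changing postgresql.conf. A minimal sketch, assuming the partition and
column names shown in the plan above; the date filter is a hypothetical
placeholder, not the poster's original query:]

```sql
-- Refresh the partition's statistics so the planner's row estimates
-- are based on current data (with default_statistics_target = 1000
-- this builds large histograms, so it may take a while).
ANALYZE dev4_act_dy_fact_2010_05_t3;

-- Try the stock planner cost settings for this session only;
-- 4.0 is the PostgreSQL 8.4 default for random_page_cost.
SET random_page_cost = 4;

-- A more modest per-sort/per-hash work area; note that work_mem is
-- allocated per plan node, not per query, so 1G can multiply.
SET work_mem = '64MB';

-- Re-run the slow query under EXPLAIN ANALYZE and compare the
-- estimated rows against the actual rows at each plan node.
EXPLAIN ANALYZE
SELECT *
FROM dev4_act_dy_fact_2010_05_t3 a
WHERE a.thedate >= DATE '2010-05-01'
  AND a.thedate <  DATE '2010-05-11';
```

Because SET only affects the current session, this leaves the server
configuration untouched while showing whether the estimates improve.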