Re: same query different execution plan (hash join vs. semi-hash join)

Thank you, Tom. But the time spent scanning table test1 is less than 1 second (91.738 compared to 87.869), so I guess this shouldn't be the issue?


-----Original Message-----
From: Tom Lane [mailto:tgl@xxxxxxxxxxxxx] 
Sent: Friday, May 16, 2014 12:58 PM
To: Huang, Suya
Cc: pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: [PERFORM] same query different execution plan (hash join vs. semi-hash join)

"Huang, Suya" <Suya.Huang@xxxxxxxxxxxxxxx> writes:
> I've got the query below. It has run several times with different execution plans and totally different execution times. The plan using a hash join is slow and the one using a semi-hash join is very fast. However, I have no control over the optimizer behavior of the PostgreSQL database. Or do I?

A salient feature of the slow plan is that the planner is misinformed about the size of test1:

>                      ->  Seq Scan on test1  (cost=0.00..5153.94 rows=63294 width=516) (actual time=0.068..91.378 rows=441736 loops=1)

whereas in the fast plan its rows estimate for that scan is dead on.
It looks like the two cases also have different ideas of how many distinct values are in the test1.userid column, though this is more a guess than an indisputable fact.

In short, I suspect you're recreating the test1 table and not bothering to ANALYZE it after you fill it.  This leaves you at the mercy of when the autovacuum daemon gets around to analyzing the table before you'll get good plans for it.

			regards, tom lane
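
For readers following along, the suggestion above can be sketched roughly as below. This is only an illustration, not something posted in the thread: it assumes test1 is an ordinary table in the current schema and that the reload step (not shown in the thread) has just finished. The catalog views used (pg_stat_user_tables, pg_class, pg_stats) are standard PostgreSQL, but the exact checks you want may differ.

```sql
-- Assumption: test1 has just been recreated and filled by your own load step,
-- which is not shown in the thread.

-- Refresh planner statistics right away instead of waiting for autovacuum.
ANALYZE test1;

-- Check when the table was last analyzed, manually or by the autovacuum daemon.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'test1';

-- Compare the planner's stored row estimate with the real row count.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'test1';

SELECT count(*) AS actual_rows
FROM test1;

-- Look at the planner's distinct-value estimate for the userid column.
SELECT n_distinct
FROM pg_stats
WHERE tablename = 'test1' AND attname = 'userid';
```

If estimated_rows is far from actual_rows before the ANALYZE (as with 63294 estimated versus 441736 actual in the slow plan) and close afterwards, stale statistics are a likely cause of the plan change.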


