Re: extremly bad select performance on huge table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tom and Igor,

thanks for your help. With the reindex the select query running time was reduced from 5200 sec to 130 sec. Impressive!

Even a join on this table is now fast.

Unfortunately, there is now another problem: The table in my example has 500 columns which I want to retrieve with my join command.

Example which is fast "select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn"

Example which is slow "select value,c1,c2,c3,...,c10 from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn"


Which is the number of columns to fetch so bad ? Which action is done in the db system when querying this via pgadmin? I think that there is no real retrieval included, why is the number of additional columns so bad for the join performance?

=?ISO-8859-15?Q?Bj=F6rn_Wittich?= <Bjoern_Wittich@xxxxxx> writes:
Here is the explain (analyze,buffers) select mycolumn from myhugetable
"Index Only Scan using myprimkey on myhugetable  (cost=0.00..8224444.82
rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999
loops=1)"
"  Heap Fetches: 356861"
"  Buffers: shared hit=71799472 read=613813"
"Total runtime: 2503009.611 ms"
So that works out to about 4 msec per page fetched considering only I/O
costs, which is about as good as you're likely to get if the data is
sitting on spinning rust.

You could potentially make it faster with a VACUUM (to mark all pages
all-visible and eliminate the "heap fetches" costs), or a REINDEX
(so that the index scan becomes more nearly sequential instead of random
access).  However, unless the data is nearly static those will just be
temporary fixes: the time will degrade again as you update the table.

Note: This select is just for testing. My final statement will be a join
on this table via the "mycolumn" column.
In that case it's probably a waste of time to worry about the performance
of this query as such.  In the first place, a join is not likely to use
the index at all unless it's fetching a relatively small number of rows,
and in the second place it seems unlikely that the join query can use
an IndexOnlyScan on this index --- I imagine that the purpose of the join
will require fetching additional columns.

			regards, tom lane





--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux