Re: Sort performance on large tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Charlie, 

> Should I expect results like this?  I realize that the 
> computer is quite low-end and is very IO bound for this 
> query, but I'm still surprised that the sort operation takes so long.

It's the sort performance of Postgres that's your problem.
 
> Out of curiosity, I setup an Oracle database on the same 
> machine with the same data and ran the same query.  Oracle 
> was over an order of magnitude faster. Looking at its query 
> plan, it avoided the sort by using "HASH GROUP BY."  Does 
> such a construct exist in PostgreSQL (I see only hash joins)?

Yes, hashaggregate does a similar thing.  You can force the planner to
do it, don't remember off the top of my head but someone else on-list
will.
 
> Also as an experiment I forced oracle to do a sort by running 
> this query:
> 
> SELECT tlid, min(ogc_fid)
> FROM completechain
> GROUP BY tlid
> ORDER BY tlid;
> 
> Even with this, it was more than a magnitude faster than Postgresql. 
> Which makes me think I have somehow misconfigured postgresql 
> (see the relevant parts of postgresql.conf below).

Just as we find with a similar comparison (with a "popular commercial,
proprietary database" :-) Though some might suggest you increase
work_mem or other tuning suggestions to speed sorting, none work.  In
fact, we find that increasing work_mem actually slows sorting slightly.

We are commissioning an improved sorting routine for bizgres
(www.bizgres.org) which will be contributed to the postgres main, but
won't come out at least until 8.2 comes out, possibly 12 mos.  In the
meantime, you will be able to use the new routine in the bizgres version
of postgres, possibly in the next couple of months.

Also - we (Greenplum) are about to announce the public beta of the
bizgres MPP database, which will use all of your CPUs, and those of
other nodes in a cluster, for sorting.  We see a linear scaling of sort
performance, so you could add CPUs and/or hosts and scale out of the
problem.

Cheers,

- Luke


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux