
Re: Slow query performance

Kevin Galligan wrote on 29-10-08 23:35:
An example of a slow query is...

select count(*) from bigdatatable where age between 22 and 40 and state = 'NY';

explain analyze returned the following...

 Aggregate  (cost=5179639.55..5179639.56 rows=1 width=0) (actual time=389529.895..389529.897 rows=1 loops=1)
   ->  Bitmap Heap Scan on bigdatatable  (cost=285410.65..5172649.63 rows=2795968 width=0) (actual time=6727.848..387159.175 rows=2553273 loops=1)
         Recheck Cond: ((state)::text = 'NY'::text)
         Filter: ((age >= 22) AND (age <= 40))
         ->  Bitmap Index Scan on idx_jstate  (cost=0.00..284711.66 rows=15425370 width=0) (actual time=6298.950..6298.950 rows=16821828 loops=1)
               Index Cond: ((state)::text = 'NY'::text)
 Total runtime: 389544.088 ms

It looks like the index scan takes around 6 seconds, but then the bitmap heap scan and aggregate jump up to 6 minutes.

Indeed. The cause is that PostgreSQL must visit the table rows themselves (the heap) to check whether each row is visible to the current transaction; the index alone cannot answer that. This means a lot of data must be retrieved from disk.

The only real thing you can do is reduce the I/O load, by reducing the amount of data that must be traversed (or by ensuring the matching data is stored close together on disk, but that's really hard to get right). This requires optimizing your database design for that single goal. It will not make the query scale any better than it currently does, however: it will remain O(N) in the size of your table, and you need other techniques to do better. A sketch of both ideas follows.
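A minimal sketch of both ideas, using the table and columns from the query above (the index name here is made up, and note that CLUSTER takes an exclusive lock and rewrites the whole table, so plan for downtime):

-- A composite index lets the bitmap index scan apply both conditions,
-- so only the ~2.5M rows that actually match are fetched from the heap,
-- instead of the ~16.8M rows that match on state alone:
CREATE INDEX idx_state_age ON bigdatatable (state, age);

-- Optionally, rewrite the table in index order so that matching rows
-- sit on adjacent pages, turning random I/O into mostly sequential I/O:
CLUSTER bigdatatable USING idx_state_age;
ANALYZE bigdatatable;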

Another option is to spend extra money on hardware that can sustain higher I/O seek rates (more and/or faster spindles).

- Joris

More detail on the table design and other stuff in a bit...


On Wed, Oct 29, 2008 at 6:18 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:

    On Wed, Oct 29, 2008 at 2:18 PM, Kevin Galligan <kgalligan@xxxxxxxxx> wrote:
     > I'm approaching the end of my rope here.  I have a large database.
     > 250 million rows (ish).  Each row has potentially about 500 pieces of
     > data, although most of the columns are sparsely populated.

    A couple of notes here.  PostgreSQL stores null values as a single bit
    in the row's null bitmap, making sparsely populated tables quite efficient
    as long as you store the non-existent values as null and not '' or some
    other real value (see the sketch after this message).

    Have you run explain analyze on your queries yet?  Pick a slow one,
    run explain analyze on it and post it and we'll see what we can do.
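A quick way to see the null-storage point above (table and column names are hypothetical): compare the on-disk size of a sparse table filled with NULLs against the same table filled with empty strings.

CREATE TABLE sparse_null  (id int, c1 text, c2 text, c3 text);
CREATE TABLE sparse_empty (id int, c1 text, c2 text, c3 text);

-- NULLs cost only one bit each in the row's null bitmap:
INSERT INTO sparse_null
  SELECT g, NULL, NULL, NULL FROM generate_series(1, 100000) g;

-- Empty strings are real values, each stored with its own header:
INSERT INTO sparse_empty
  SELECT g, '', '', '' FROM generate_series(1, 100000) g;

SELECT pg_total_relation_size('sparse_null')  AS null_bytes,
       pg_total_relation_size('sparse_empty') AS empty_bytes;

With only three text columns the difference is modest; with hundreds of sparsely populated columns, as in the table discussed in this thread, it adds up quickly.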


