max time in a table query takes ages

"Grzegorz Jaśkiewicz" <gryzman@xxxxxxxxx> · Thu, 23 Oct 2008 11:22:43 +0100

hey folks
I have a simple query over a fairly simple query here, that scans for max date in a table that's fairly hudge (300M rows). there's index on that field that's being used, but for whatever reason, it takes ages. Ideas ?

 select date_trunc('day', max(data)) into dt from staticstats where processed = false
explain analyze:
                                                                                   QUERY PLAN 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=3.89..3.90 rows=1 width=0) (actual time=2558459.883..2558459.884 rows=1 loops=1)

   InitPlan
     ->  Limit  (cost=0.00..3.89 rows=1 width=8) (actual time=2558362.751..2558362.753 rows=1 loops=1)
           ->  Index Scan Backward using sstats_date_idx on staticstats  (cost=0.00..1566198296.88 rows=402561795 width=8) (actual time=2558362.747..2558362.747 rows=1 loops=1)

                 Filter: ((data IS NOT NULL) AND (NOT processed))
 Total runtime: 2558540.800 ms
(6 rows)

Time: 2558545.012 ms

one thing I am amazed by, is the filter data is not null, well - take a look at the schema here:
staty=> \d+ staticstats
                                           Table "public.staticstats"
  Column   |              Type              |                      Modifiers                       | Description

-----------+--------------------------------+------------------------------------------------------+-------------
 data      | timestamp(0) without time zone | not null                                             |

 size      | integer                        | not null default 0                                   |
 proto     | integer                        | not null                                             |
 macfrom   | integer                        | not null                                             |

 macto     | integer                        | not null                                             |
 processed | boolean                        | not null default false                               |
 id        | bigint                         | not null default nextval('sstatic_id_seq'::regclass) |

Indexes:
    "blah123s" PRIMARY KEY, btree (macto, data, proto, macfrom)
    "sstats_id_idx" UNIQUE, btree (id)
    "sstats_date_idx" btree (data)
    "staticstat_processed_idxs" btree (processed)

Foreign-key constraints:
    "staty_fk1s" FOREIGN KEY (macfrom) REFERENCES macs(id)
    "staty_fks" FOREIGN KEY (macto) REFERENCES macs(id)
Has OIDs: no

it takes ms if there's somethign that's been recently added to that table. The table itself is vacuumed/analyzed quite often, and more or less clustered by sstats_date_idx - althrough in that instance, I wasn't able to recluster it - because there's not enough disc space (only 45GB free, and for whatever reason - even tho the table is only about 25GB in size - postgresql requires more than 40GB of space to recluster it).
any hints please ?
-- 
GJ