Okay, I see your point with the staging table. That's a good idea!
The only problem I see here is the transfer-to-archive-table process. As
you've correctly noticed, the system is more or less real-time, and there
can be dozens of processes writing to the staging table; I cannot see
how to make the transfer/flush process correct and clean...
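One common pattern that keeps the writers unblocked is to swap the staging table out atomically and then flush the detached copy at leisure. A minimal sketch, assuming a staging table named `staging` with a timestamp column `ts` and a monthly archive table `archive_2009_10` (all names here are illustrative, not from this thread):

```sql
BEGIN;
  -- Writers queue only for the instant of the rename
  -- (ALTER TABLE ... RENAME takes an exclusive lock, but is fast).
  ALTER TABLE staging RENAME TO staging_flush;
  CREATE TABLE staging
    (LIKE staging_flush INCLUDING DEFAULTS INCLUDING INDEXES);
COMMIT;

-- New rows now land in the fresh staging table; staging_flush is
-- static, so it can be copied to the archive without racing anyone.
-- Inserting in timestamp order keeps the archive naturally clustered.
INSERT INTO archive_2009_10 SELECT * FROM staging_flush ORDER BY ts;
DROP TABLE staging_flush;
```

One caveat: writers that cached a plan or hold a long transaction across the rename may need to reconnect or retry, so this is a sketch of the approach rather than a drop-in solution.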
Pierre Frédéric Caillaud wrote:
> > Oh, btw, 95% of queries are searching rows for the current date (last
> > 24 hours).
>
> You may want to use a daily staging table and then flush to the
> monthly archive tables at the end of the day.
> If the rows in the archive tables are never updated, this strategy
> means you never need to vacuum the big archive tables (and indexes),
> which is good. Also, you can insert the rows into the archive table in
> the order of your choice, the timestamp for example, which keeps it
> nicely clustered without ever needing to run CLUSTER.
> And with partitioning you can have lots of indexes on the staging
> table (and the current month's partition) to speed up your most common
> queries (which are likely to be more OLTP), while using fewer indexes
> on the older partitions (saving disk space), since queries on old
> partitions are likely to be reporting queries that grind through a
> large part of the table anyway.
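The per-partition indexing idea above can be sketched as follows, using inheritance-style partitioning as in 8.x-era PostgreSQL (all table, column, and index names are illustrative):

```sql
-- Each month is a child table with a CHECK constraint, so the planner
-- can skip irrelevant months (constraint_exclusion must be enabled).
CREATE TABLE logs_2009_10 (
    CHECK (ts >= DATE '2009-10-01' AND ts < DATE '2009-11-01')
) INHERITS (logs);

-- The current month's partition carries the OLTP-style indexes...
CREATE INDEX logs_2009_10_ts_idx   ON logs_2009_10 (ts);
CREATE INDEX logs_2009_10_user_idx ON logs_2009_10 (user_id, ts);

-- ...while an older partition keeps only a minimal index (or none),
-- since reporting queries scan a large fraction of it anyway.
CREATE INDEX logs_2009_09_ts_idx ON logs_2009_09 (ts);
```

The disk-space saving comes from the old partitions: indexes there cost space and insert/maintenance time but buy little for sequential reporting scans.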
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance