Re: Partition table in 9.0.x?

AJ Weber <aweber@xxxxxxxxxxx> · Tue, 08 Jan 2013 11:45:49 -0500

It probably does, but from psql command line, you can do \d+ and \di+
\d+ doesn't appear to display any size information.

If you have little control over your storage and are already IO bound, 
and the tables are growing rapidly, you may need to rethink that 
"deletes are rare" bit.  So the inserts and updates do target a hot 
part, while the selects are evenly spread?

In that case, it is very important to know if the slow part are the 
selects, or the insert and deletes.  If the selects are slow, and the 
hot rows for selects can't be gathered together into a hot partition, 
then after clustering they will still be slow as the disk will still 
have to seek all over the place (massive data-mining type selects 
might be an exception to that, but I wouldn't count on it).
Since order_num is sequential, I could partition on it in broad 
(sequential) ranges.  That would put all recent/new rows in one 
table-partition that would be a fraction of the size of the overall 
(unpartitioned) table.  I guess that would require manual maintenance 
over-time (to switch to another, new partition as each grows).

I think it is a matter of semantics. A small table is poor candidate 
for partitioning even if it has an excellent key to use for 
partitioning.  A large table could be a good candidate up until you 
realize it doesn't have a good key to use, at which point it stops 
being a good candidate (in my opinion).

My first idea to evenly-partition the table was to use the order_num and 
do a "mod" on it with the number of tables I wanted to use.  That would 
yield a partition-table number of 0-mod, and all rows for the same order 
would stay within the same partition-table.  However, you're right in 
thinking that a search for orders could -- really WOULD -- require 
retrieving details from multiple partitions, probably increasing IO.  So 
maybe the sequential partitioning (if at all) is better, just more 
maintenance down-the-road.

Was the order_num (from the parent table) the leading field of the 4 
column PK?  If not, you might want to reorder the PK so that it is the 
leading field and cluster again.  Or if reordering the PK columns is 
not convenient, make a new index on the order_num and cluster on that 
(perhaps dropping the index after the cluster, if it no longer serves 
a purpose)

Yes, the order_num is the first column in the PK, and our main browse 
queries use, at a minimum, the first 2-3 columns in that PK in their 
where-clause.

Many thanks again for all the input!
-AJ

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance