On Tue, Jan 8, 2013 at 8:45 AM, AJ Weber <aweber@xxxxxxxxxxx> wrote: > >> >> It probably does, but from psql command line, you can do \d+ and \di+ > > \d+ doesn't appear to display any size information. It does if you use it without an argument, to display all the tables in the search path: jjanes=# \d+ List of relations Schema | Name | Type | Owner | Size | Description --------+------------------+-------+--------+---------+------------- public | pgbench_accounts | table | jjanes | 128 MB | public | pgbench_branches | table | jjanes | 40 kB | public | pgbench_history | table | jjanes | 0 bytes | public | pgbench_tellers | table | jjanes | 40 kB | (4 rows) It rather annoys me that you actually get less information (no size, no owner) when you use \d+ on a named table. I don't know if there is a reason for that feature, or if it was just an oversight. > > >> >> If you have little control over your storage and are already IO bound, and >> the tables are growing rapidly, you may need to rethink that "deletes are >> rare" bit. So the inserts and updates do target a hot part, while the >> selects are evenly spread? >> >> In that case, it is very important to know if the slow part are the >> selects, or the insert and deletes. If the selects are slow, and the hot >> rows for selects can't be gathered together into a hot partition, then after >> clustering they will still be slow as the disk will still have to seek all >> over the place (massive data-mining type selects might be an exception to >> that, but I wouldn't count on it). > > Since order_num is sequential, I could partition on it in broad (sequential) > ranges. That would put all recent/new rows in one table-partition that > would be a fraction of the size of the overall (unpartitioned) table. I > guess that would require manual maintenance over-time (to switch to another, > new partition as each grows). Yep. If your selects are concentrated in those recent/new, this could be very useful. But, if your selects are not concentrated on the recent/new rows, the benefit would be small. > > >> >> I think it is a matter of semantics. A small table is poor candidate for >> partitioning even if it has an excellent key to use for partitioning. A >> large table could be a good candidate up until you realize it doesn't have a >> good key to use, at which point it stops being a good candidate (in my >> opinion). >> > My first idea to evenly-partition the table was to use the order_num and do > a "mod" on it with the number of tables I wanted to use. That would yield a > partition-table number of 0-mod, The current constraint exclusion code is quite simple-minded and doesn't know how to make use of check constraints that use the mod function, so the indexes of all partitions would have to be searched for each order_num-driven query, even though we know the data could only exist in one of them. The constraint exclusion codes does understand check constraints that involve ranges. There could still be some benefit as the table data would be concentrated, even if the index data is not. > and all rows for the same order would stay > within the same partition-table. But usually a given order_num would only be of interest for a fraction of a second before moving on to some other order_num of interest, so by the time the relevant partition become fully cached, it would no longer be hot. Or, if the partitions were small enough, you could assume that all rows would be dragged into memory when the first one was requested because they lay so close to each other. But it is not feasible to have a large enough number of partitions to make that happen. But if the table is clustered, this is exactly what you would get--the trouble would be keeping it clustered. If most of the line-items are inserted at the same time as each other, they probably should be fairly well clustered to start with. Cheers, Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance