David Rysdam <drysdam@xxxxxxxxxx> wrote: > We have a by-our-standards large table (about 40e6 rows). Since it is > the bottleneck in some places, I thought I'd experiment with > partitioning. In my personal experience I have gone into hundreds of millions of rows with good performance without partitioning. It's all about designing good indexes for the workload. I have only seen partitioning help in two cases: (1) There will be bulk deletes of rows, and you know at insert time which bulk delete the row belongs with. Dropping a partition table is a very fast way to delete a large number of rows. (2) The bulk of activity will be on a relatively small subset of the rows at any one time, and you can partition such that the set of active rows will be in a small number of partitions. In all other cases, I have only seen partitioning harm performance. There is no reason to think that checking the table-level constraints on every partition table will be faster than descending through an index tree level. > The table holds data about certain objects, each of which has an object > number and some number of historical entries (like account activity at a > bank, say). The typical usage pattern is: relatively rare inserts that > happen in the background via an automated process (meaning I don't care > if they take a little longer) and frequent querying, including some > where a human is sitting in front of it (i.e. I'd like it to be a lot > faster). > > Our most frequent queries either select "all history for object N" or > "most recent item for some subset of objects". > > Because object number figure so prominently, I thought I'd partition on > that. To me, it makes the most sense from a load-balancing perspective Load balancing? Hitting a single partition more heavily improves your cache hit ratio. What sort of benefit are you expecting from spreading the reads across all the partitions? *Maybe* that could help if you carefully placed each partition table on a separate set of spindles, but usually you are better off having one big RAID so that every partition is spread across all the spindles automatically. > Lower numbers are going to be queried much less often than higher > numbers. This suggests to me that you *might* get a performance boost if you define partitions on object number *ranges*. It still seems a bit dubious, but it has a chance. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general