On 19/04/11 23:56, Phoenix Kiula wrote: > While I fix some bigger DB woes, I have learned a lesson. Huge indexes > and tables are a pain. > > Which makes me doubly keen on looking at partitioning. > > Most examples I see online are partitioned by date. As in months, or > quarter, and so on. This doesn't work for me as I don't have too much > logic required based on time. > > The biggest, highest volume SELECT in my database happens through an > "alias" column. This is an alphanumeric column. The second-biggest > SELECT happens through the "userid" column -- because many users check > their account every day. If user id -> alias and/or alias -> user id lookups are really "hot", consider moving them to a subtable, so you don't have to worry about whether to partition by user id or alias, and so that the table is really small, easily cached, and fast to scan. For example: CREATE TABLE user_alias ( alias VARCHAR(42) PRIMARY KEY, user_id integer REFERENCES maintable(id) ); If you like you can retain the "alias" column in "maintable", making that a REFERENCE to user_alias(alias) so you force a 1:1 relationship and don't have to JOIN on user_alias to get alias data for a user. The downside of that is that the circular/bidirectional reference requires you to use 'DEFERRABLE INITIALLY DEFERRED' on one or both references to be able to insert, and that can cause memory use issues if you do really big batch inserts and deletes on those tables. > 1. Which column should I partition by -- the "alias" because it's the > largest contributor of queries? This should be OK, but my concern is > that when user_id queries are happening, then the data for the same > user will come through many subtables that are partitioned by "alias" See above: consider splitting the user-id-to-alias mapping out into another table. > 3. If I partition using "a%", "b%" etc up to "z%" as the partition > condition, is this an issue It might be worth examining the distribution of your data and partitioning on constraints that distribute the data better. There'll be a lot more "c"s than "z"s. That said, it might not be worth the complexity and you'd have to check if the constraint exclusion code was smart enough to figure out the conditions. I don't have much experience with partitioning and have never tried or tested partitioning on a LIKE pattern. > 6. Triggers - how do they affect speed? A constraint is not a trigger, they're different. SELECTs on partitioned tables are not affected by triggers. For INSERT, UPDATE and DELETE, where you're redirecting INSERTs into the parent table into the appropriate partition, then speed might be a concern. It probably doesn't matter. If you find it to be an issue, then rather then re-writing the trigger in C, you're probably better off just INSERTing directly into the appropriate subtable and thus bypassing the trigger. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general