Re: partitions vs indexes

"Enrico Thierbach" <eno@xxxxxxxxxxxx> · Thu, 03 Oct 2019 12:29:34 +0200

On 2 Oct 2019, at 22:16, Michael Lewis wrote:

"I would like to convert a table with a primary key into a partitioned setup by a column which is not part of the primary key"

That isn't possible. The partition key must be contained by the primary key. That is, the primary key could be site_id, id and you can create hash partition on id or site_id but not created_on.

You could drop primary key and foreign keys and implement them via trigger functions as described in this blog series, but it seems questionable-
https://www.depesz.com/2018/10/31/foreign-key-to-partitioned-table-part-2/

I do not assume the restriction would be dropped in future releases. I don't know that scanning all the partitions to figure out whether the primary key is violated would be advisable. Which is what the trigger functions described in the blog post has to do, right?

It might be noteworthy that partitioning with more than 10-100 partitions is MUCH faster in PG12 than PG11 (up to 4-8 thousand partitions) from testing shared by those working on that code.

Michael, thank you for your response. I think I now grasp the idea: if there is a uniqueness constraint, then the database would rather not visit all partitions to check for constraint violation, but want to identify the single partition that might fail that; hence any partitioning value must be a subset of or be identical to the uniqueness constraint. 

I get that this makes sense if you have many partitions; however, I basically want to end up with two partitions, “hot” and “cold”. A row’s lifetime starts in a hot partition, and, after being processed, moves into the cold partition. 

Most of the work actually happens in the hot partition, so I think having this as small as possible is probably helpful. For numbers: the hot partition would tyically contain ~10000 rows, the cold partition, on the other hand, will have 10s of millions. At the same time I still want to be able to look up a row by its id in the root relation, not in the concrete partitions. Having the database validate a uniqueness constraint in two tables instead of in one would be a worthwhile sacrifice for me.

Having said that I just realized I could probably reach my goal by setting up explicit hot and cold tables, move rows around manually whenever their “hotness” changes, and set up a view which combines both tables into a single relation. I would only have to drop all explicit FK references from the schema. A downside, certainly, but one that I could live with.

Best,

/eno

--

me on github: http://github.com/radiospiel