Dear all, I'm looking for ideas here, and it could be someone already stepped into declarative partitioning of an existing database where Hibernate (a Java ORM) handles the tables. The situation is as follows: create table foo( id primary key, a_date date, ... ); Now, the trivial way to partition this would be on a range based on a_date, so that the primary key of the tables shifts from id to (id, a_date). One thing that frightens me is that Hibernate does a lot of per-row lookups by means of the id, so while the partitioning is probably going to make things more manageable and even faster in some scenarios, could lead to drawbacks when Hibernate queries by id. Moreover, hibernate will think id is unique while it is not anymore. Last but not least, referencing foreign keys are made by Hibernate thru the id column, and it means that incoming foreign keys to foo will not be in place anymore. Now, I know that I can define a composite key in hibernate, in order to match the effective new data structure, but this requires a huge rewrite of the application code. And after all, we are talking about a non-PostgreSQL related piece, so the problem is not on the PostgreSQL side. Anyone has already done a partitioning in such scenario? I am thinking that partitioning on an hash of id could be the only way to go without having to touch the hibernate side, even if this would bring up a less balanced partitioned structure. In such case, I mean partitioning by hash, having a table with 60 millions rows per 50 GB in size, what would be the rule of thumb to select the number of partitions (i.e., a suggested modulus)?
We're in the exact same situation, using Hibernate and having many tables like foo, partitioned by a_date with PK of (id, a_date) and FK definitions (id, a_date).
It was a massive mistake, since many queries span partitions. Within a year, I "departitioned" all tables except two giant tables that have large bytea columns. (All the formerly partitioned tables still have (id, a_date) as PK and FK. I'd like to change that, but the amount of code change is untennable given them amount of new features that need to be added.)
Thus, my recommendations are:
1. only partition the tables you must, and
2. partition by id.
--
Born in Arizona, moved to Babylonia.
Born in Arizona, moved to Babylonia.