Greetings Postgres Developers,
I've recently started taking advantage of the PARTITION BY HASH feature for my database system. It's a really great fit since my tables can get quite large (900M+ rows for some) and splitting them up into manageable chunks should let me upload to them without having to update an enormous index every time. What's more, since each partition has a write lock independent of the parent table, it should theoretically be possible to perform a parallelized insert operation, provided the data to be added is partitioned beforehand.
What has been disappointing is that the query planner doesn't seem to recognize this potential. For example, if I have a large list of input data, and I want to perform a select operation across the target table:
-- target table is hashed on 'textfield' & has a unique index on 'textfield'
select * from temp_data td left join target tg on td.textfield = tg.textfield;
I would expect to get a query plan like this:
partition temp_data
parallel scan on
target_p0 using target_p0_textfield_uniq_idx against temp_data_p0
target_p1 using target_p1_textfield_uniq_idx against temp_data_p1
target_p2 using target_p2_textfield_uniq_idx against temp_data_p2
...
Instead, I get a seemingly terrible plan like this:
hash temp_data
sequential scan on
target_p0 against temp_data
target_p1 against temp_data
target_p2 against temp_data
...
It doesn't even make use of the index on the textfield! Instead, it opts to hash all of temp_data and perform a sequential scan against it.
It doesn't help if I partition temp_data by textfield beforehand either (using the same scheme as the target table). It still opts to concatenate all of temp_data, hash it, then perform a sequential scan against the target partitions.
On insert the behaviour is better but it still opts for a sequential insert instead of a parallel one.
Does the query planner know something I don't? It's my intuition that it should be faster to do a rough counting sort (partition by hash) first, and then do N smaller more accurate sorts in parallel afterwards.
Currently I am creating a custom script(s) to emulate my desired behaviour, but it would be nice if there was a way to get the query planner to do this automatically. Any tricks to do this would be much appreciated!
-Ben