[...]
Also, there are more configuration settings related to parallel queries you might want to look into. Most notably:
parallel_setup_cost
parallel_tuple_cost
min_parallel_table_scan_size
The last one in particular is a typical dealbreaker; you can try setting it to 0 to start. Good starting values for the other two are 500 and 0.1, respectively.
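For reference, a sketch of how you could try these in a single session before touching postgresql.conf (the query is a placeholder; the values are the starting points suggested above, not tuned recommendations):

```sql
-- Session-level experiment; verify the effect with EXPLAIN afterwards.
SET min_parallel_table_scan_size = 0;  -- default 8MB; 0 removes the size threshold
SET parallel_setup_cost = 500;         -- default 1000
SET parallel_tuple_cost = 0.1;         -- this is already the default
EXPLAIN (ANALYZE) SELECT ...;          -- check "Workers Launched" in the plan
```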
Aha!
By setting min_parallel_table_scan_size=0, Postgres uses the 6 workers I expect, and the execution time decreases nicely.
I posted a clumsily-anonymized plan for the “bad” scenario here --
https://gist.github.com/osvenskan/ea00aa71abaa9697ade0ab7c1f3b705b
There are 3 sort nodes in the plan. When I get the “bad” behavior, the sorts have one worker; when I get the good behavior, they have multiple workers (e.g. 6).
I also think what Luis pointed out earlier might be a good option for you, i.e. setting
parallel_leader_participation = off;
And by the way, with leader participation enabled, that 1 worker actually turns into 2 workers in total.
This brings up a couple of questions —
1)
I’ve read that this is Postgres’ formula for the max # of workers it will consider for a table —
max_workers = log3(table size / min_parallel_table_scan_size)
Does that use the raw table size, or does the planner use statistics to estimate the size of the subset of the table that will be read before allocating workers?
"table size" is the number of PSQL pages, i.e. relation-size / 8 kB. This comes from statistics.
2)
There are 9 tables in this query ranging in size from 72 kB to 17 GB. Does Postgres decide on a table-by-table basis to allocate multiple workers, or…?
AFAIK, it decides per table initially, but the final result depends on the chosen Gather node.
Thank you so much for the suggestion; I feel un-stuck now that I have an idea to experiment with.
You are welcome. We are actually about to publish a blog post with some more suggestions for parallelism in it.