Okay, after reading http://rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html I do see that I was using force_parallel_mode incorrectly and wouldn't have gotten what I wanted even if the original query had been possible to parallelize.
> Maybe, but unfairness multiplies if it's part of a larger plan
Ah, I didn't think of that, and it's a good point.
> Ok I hacked my copy of PostgreSQL to let me set parallel_setup_costs
> to negative numbers ...
Thanks for taking the time to do that and look into it. I don't actually think it's worth the confusion to allow this in general, but I had assumed that setting "force_parallel_mode = on" would essentially do something equivalent to this (though I now see that is wrong).
> But it's probing every index for every one of the values in the big
> list, not just the ones that have a non-zero chance of finding a
> match, which is a waste of cycles.
In my case, this would actually be quite helpful because the real bottleneck when I run this in production is time spent waiting for IO. I was hoping to spread that IO wait time over multiple parallel workers, and wouldn't really care about the few extra wasted CPU cycles. But I can't actually do this, as I can't set parallel_setup_cost to be negative, so I wouldn't be able to get PG to choose the parallel plan even if I did partition the table.
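For reference, this is roughly the kind of session setup I mean. It is only a sketch: the values are illustrative rather than tuned, and zero is the lowest the cost settings will accept on an unpatched server, which is why the planner still doesn't pick the parallel plan for me.

```sql
-- Make parallel plans look as cheap as the planner allows.
-- Zero is the minimum for both cost settings; negative values are rejected.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

-- Let even small tables be considered for a parallel scan.
SET min_parallel_table_scan_size = 0;

-- Illustrative worker count, not a recommendation.
SET max_parallel_workers_per_gather = 4;
```

After that I compare plans with EXPLAIN (ANALYZE, BUFFERS) on the query.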
> If I had more timerons myself ...
If only we all had more timerons ... :)
Thanks,
Alex Kaiser
On Wed, Feb 1, 2023 at 6:12 PM David Rowley <dgrowleyml@xxxxxxxxx> wrote:
On Thu, 2 Feb 2023 at 14:49, Thomas Munro <thomas.munro@xxxxxxxxx> wrote:
> If I had more timerons myself, I'd like to try to make parallel
> function scans, or parallel CTE scans, work...
I've not really looked in detail but I thought parallel VALUES scan
might be easier than those two.
David