I've upped max_worker_processes to 16, but I still can't get it to launch workers unless I use EXPLAIN ANALYZE. I've also found that this simplified setup exhibits the same behavior:
CREATE TABLE big AS (
    SELECT generate_series(1, 30000000) AS id
);
EXPLAIN ANALYZE SELECT avg(id) FROM big WHERE id % 17 = 0;
With EXPLAIN ANALYZE, the plan includes:
Workers Planned: 4
Workers Launched: 4
But without EXPLAIN ANALYZE, it never launches workers:
Workers Planned: 4
Workers Launched: 0
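For anyone trying to reproduce this, here's a quick sanity check of the settings that gate worker launch (standard 9.6 GUC names; parallel query also needs dynamic shared memory available):

```sql
-- Settings that control whether planned workers can actually launch.
SHOW max_worker_processes;             -- cluster-wide cap on background workers
SHOW max_parallel_workers_per_gather;  -- per-Gather cap; 0 disables parallel query
SHOW dynamic_shared_memory_type;       -- must not be 'none' for parallel query
```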
I've also tried this on a CentOS VM (VirtualBox on a Windows host) with 2 cores, and it worked as expected there (workers launched both with and without EXPLAIN ANALYZE), so I've only been able to reproduce the problem on Windows.
Thanks,
Jay K
On Thu, Sep 29, 2016 at 9:07 PM David Rowley <david.rowley@xxxxxxxxxxxxxxx> wrote:
On 30 September 2016 at 10:47, Jay Knight <jay@xxxxxxxxxxxxx> wrote:
>>What's max_worker_processes set to?
>
> 8
>
>>One theory would be that the worker might not have been available
>>when you performed the query execution, but it just happened to be
>>when you did the EXPLAIN ANALYZE
>
> This happens consistently this way. Every time I run it with explain
> analyze it uses parallel workers, and every time I run it without it
> doesn't. I just enabled auto_explain, and see that it is "planning" to use
> a worker, but doesn't launch it. Is there a way to know why a planned
> worker doesn't get launched?
It seems not, at least not unless you're willing to attach a debugger.
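A minimal debugger sketch, assuming a Linux build and the 9.6 source tree (function names from src/backend/access/transam/parallel.c and bgworker.c; the backend PID is whatever `SELECT pg_backend_pid();` returns in the session running the query):

```
$ gdb -p <backend_pid>                         # attach to the backend running the query
(gdb) break LaunchParallelWorkers              # where the Gather node requests its workers
(gdb) break RegisterDynamicBackgroundWorker    # returns false when no worker slot is free
(gdb) continue
# then run the query in that session and step through the registration calls
```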
> 2016-09-29 16:45:44 CDT LOG: duration: 50703.595 ms plan:
>   Query Text: select count(*) from t1
>     join t2 on t1.item = t2.item and t1.low <= t2.high and t1.high >= t2.low
>   Finalize Aggregate (cost=14609058.99..14609059.00 rows=1 width=8) (actual time=50703.584..50703.584 rows=1 loops=1)
>     -> Gather (cost=14609058.88..14609058.99 rows=1 width=8) (actual time=50703.581..50703.581 rows=1 loops=1)
>          Workers Planned: 1
>          Workers Launched: 0
>          -> Partial Aggregate (cost=14608958.88..14608958.89 rows=1 width=8) (actual time=50703.579..50703.579 rows=1 loops=1)
>               -> Nested Loop (cost=0.42..13608937.28 rows=400008641 width=0) (actual time=0.534..50577.673 rows=3669891 loops=1)
>                    -> Parallel Seq Scan on t1 (cost=0.00..3386.71 rows=176471 width=12) (actual time=0.041..18.351 rows=300000 loops=1)
>                    -> Index Only Scan using t2_item_low_high_idx on t2 (cost=0.42..63.77 rows=1333 width=12) (actual time=0.167..0.168 rows=12 loops=300000)
>                         Index Cond: ((item = t1.item) AND (low <= t1.high) AND (high >= t1.low))
>                         Heap Fetches: 0
That's interesting.
How about trying to increase max_worker_processes to, say, 16? I know
you've said you've tried multiple times and it seems consistent, but
increasing it seems worth a try, if only to rule that out.
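If it helps, a sketch of one way to do that (ALTER SYSTEM needs a superuser, and max_worker_processes only takes effect after a server restart):

```sql
ALTER SYSTEM SET max_worker_processes = 16;
-- restart the server, then confirm:
SHOW max_worker_processes;
```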
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services