Re: Adding nextval() to a select caused hang/very slow execution

Eric Raskin <eraskin@xxxxxxxxxxxx> · Wed, 4 Nov 2020 14:12:07 -0500

OK - I see.  And to add insult to injury, I tried creating a temporary table to store the intermediate results.  Then I was going to just do an insert... select... to insert the rows.   That would de-couple the nextval() from the query.
Strangely, the first query I tried it on worked great.  But, when I tried to add a second set of data with a similar query to the same temporary table, it slowed right down again.  And, of course, when I remove the insert, it's fine.  

And, of course, your explanation that inserts will not be parallelized must be the reason.  I will certainly re-vacuum the tables.  I wonder why auto-vacuum didn't collect better stats.  vacuum  analyze <table> is all I need, right?

As a last resort, what about a PL/PGSQL procedure loop on the query result?  Since the insert is very few rows relative to the work the select has to do, I could just turn the insert.. select.. into a for loop.  Then the select could be parallel?

What do you think?

On Wed, Nov 4, 2020 at 2:01 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Eric Raskin <eraskin@xxxxxxxxxxxx> writes:

> And, to follow up on your question, the plan shape DOES change when I

> add/remove the nextval() on a plain explain.

> Without nextval():  https://explain.depesz.com/s/SCdY

> With nextval():  https://explain.depesz.com/s/oLPn

Ah, there's your problem, I think: the plan without nextval() is

parallelized while the plan with nextval() is not, because nextval() is

marked as parallel-unsafe.  It's not immediately clear why that would

result in more than about a 4X speed difference, given that the parallel

plan is using 4 workers.  But some of the rowcount estimates seem fairly

far off, so I'm betting that the planner is just accidentally lighting on

a decent plan when it's using parallelism while making some poor choices

when it isn't.

The reason for the original form of your problem is likely that we don't

use parallelism at all in non-SELECT queries, so you ended up with a bad

plan even though the nextval() was hidden in a trigger.

What you need to do is get the rowcount estimates nearer to reality

--- those places where you've got estimated rowcount 1 while reality

is tens or hundreds of thousands of rows are just disasters waiting

to bite.  I suspect most of the problem is join conditions like

Join Filter: (CASE WHEN (c.rtype = ANY ('{0,1,7,9}'::bpchar[])) THEN c.rtype ELSE x.rtype END = '2'::bpchar)

The planner just isn't going to have any credible idea how selective

that is.  I wonder to what extent you could fix this by storing

generated columns that represent the derived conditions you want to

filter on.

                        regards, tom lane

-- 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Eric H. Raskin                                                                                                      914-765-0500 x120 or 315-338-4461 (direct)
Professional Advertising Systems Inc.                                                                     fax: 914-765-0500 or 315-338-4461 (direct)
3 Morgan Drive #310                                                                                           eraskin@xxxxxxxxxxxx
Mt Kisco, NY 10549                                                                                              http://www.paslists.com