Re: Adding nextval() to a select caused hang/very slow execution

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



OK - I see.  And to add insult to injury, I tried creating a temporary table to store the intermediate results.  Then I was going to just do an insert... select... to insert the rows.   That would de-couple the nextval() from the query.

Strangely, the first query I tried it on worked great.  But, when I tried to add a second set of data with a similar query to the same temporary table, it slowed right down again.  And, of course, when I remove the insert, it's fine.  

And, of course, your explanation that inserts will not be parallelized must be the reason.  I will certainly re-vacuum the tables.  I wonder why auto-vacuum didn't collect better stats.  vacuum  analyze <table> is all I need, right?

As a last resort, what about a PL/PGSQL procedure loop on the query result?  Since the insert is very few rows relative to the work the select has to do, I could just turn the insert.. select.. into a for loop.  Then the select could be parallel?

What do you think?


On Wed, Nov 4, 2020 at 2:01 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Eric Raskin <eraskin@xxxxxxxxxxxx> writes:
> And, to follow up on your question, the plan shape DOES change when I
> add/remove the nextval() on a plain explain.
> Without nextval():  https://explain.depesz.com/s/SCdY
> With nextval():  https://explain.depesz.com/s/oLPn

Ah, there's your problem, I think: the plan without nextval() is
parallelized while the plan with nextval() is not, because nextval() is
marked as parallel-unsafe.  It's not immediately clear why that would
result in more than about a 4X speed difference, given that the parallel
plan is using 4 workers.  But some of the rowcount estimates seem fairly
far off, so I'm betting that the planner is just accidentally lighting on
a decent plan when it's using parallelism while making some poor choices
when it isn't.

The reason for the original form of your problem is likely that we don't
use parallelism at all in non-SELECT queries, so you ended up with a bad
plan even though the nextval() was hidden in a trigger.

What you need to do is get the rowcount estimates nearer to reality
--- those places where you've got estimated rowcount 1 while reality
is tens or hundreds of thousands of rows are just disasters waiting
to bite.  I suspect most of the problem is join conditions like

Join Filter: (CASE WHEN (c.rtype = ANY ('{0,1,7,9}'::bpchar[])) THEN c.rtype ELSE x.rtype END = '2'::bpchar)

The planner just isn't going to have any credible idea how selective
that is.  I wonder to what extent you could fix this by storing
generated columns that represent the derived conditions you want to
filter on.

                        regards, tom lane


--

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Eric H. Raskin                                                                                                      914-765-0500 x120 or 315-338-4461 (direct)

Professional Advertising Systems Inc.                                                                     fax: 914-765-0500 or 315-338-4461 (direct)

3 Morgan Drive #310                                                                                           eraskin@xxxxxxxxxxxx

Mt Kisco, NY 10549                                                                                              http://www.paslists.com


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux