Re: Can we get rid of repeated queries from pg_dump?

Adrian Klaver <adrian.klaver@xxxxxxxxxxx> · Fri, 27 Aug 2021 14:53:01 -0700

On 8/27/21 2:23 PM, Tom Lane wrote:
Laurenz Albe <laurenz.albe@xxxxxxxxxxx> writes:
On Thu, 2021-08-26 at 18:06 +0200, hubert depesz lubaczewski wrote:
In total, there were 5000 queries:
SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL)
But there were only 83 separate oids that were scanned.

That is a strong argument for using a hash table to cache the types.

Those queries are coming from getFormattedTypeName(), which is used
for function arguments and the like.  I'm not quite sure why Hubert
is seeing 5000 such calls in a database with only ~100 functions;
surely they don't all have an average of 50 arguments?

Could be.

From the stats post:

"Based on my reading of queries in order it seems to follow the pattern of:

One call for:

SELECT  proretset,  prosrc,  probin,  provolatile,  proisstrict, 
prosecdef,  lanname,  proconfig,  procost,  prorows, 
pg_catalog.pg_get_function_arguments(p.oid) AS funcargs, 
pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs, 
pg_catalog.pg_get_function_re
sult(p.oid) AS funcresult,  proleakproof,  array_to_string(protrftypes, 
' ') AS protrftypes,  proparallel,  prokind,  prosupport,  NULL AS 
prosqlbody  FROM pg_catalog.pg_proc p, pg_catalog.pg_language l  WHERE 
p.oid = 'SOME_NUMBER'::pg_catalog.oid AND l.oid = p.prolang

and then one or more:

SELECT pg_catalog.format_type('SOME_NUMBER'::pg_catalog.oid, NULL)

In one case, after proc query, there were 94 concecutive
pg_catalog.format_type queries.
"

I experimented with the attached, very quick-n-dirty patch to collect
format_type results during the initial scan of pg_type, instead.  On the
regression database in HEAD, it reduces the number of queries pg_dump
issues from 3260 to 2905; but I'm having a hard time detecting any net
performance change.

(This is not meant for commit as-is; notably, I didn't bother to fix
getTypes' code paths for pre-9.6 servers.  It should be fine for
performance testing though.)

			regards, tom lane

--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx