On 3/22/23 12:09, Bryn Llewellyn wrote:
laurenz.albe@xxxxxxxxxxx <mailto:laurenz.albe@xxxxxxxxxxx> wrote:
...I understand that you ask questions to gain deeper understanding.
bryn@xxxxxxxxxxxx <mailto:bryn@xxxxxxxxxxxx> wrote:
...I had never come across use cases where [scrollability] was
beneficial. I wanted, therefore, to hear about some. I thought that
insights here would help me understand the mechanics.
I recently used cursor scrollability, so I can show you a use case:
github.com/cybertec-postgresql/db_migrator/blob/master/db_migrator--1.0.0.sql#L49 <http://github.com/cybertec-postgresql/db_migrator/blob/master/db_migrator--1.0.0.sql#L49>
Thanks for the link to your SQL file at the line where you get the row
count in the way that you describe. I saw that this is in the PL/pgSQL
source text for function "materialize_foreign_table()" (~200 lines). And
I saw that you use the cursor mechanism that we're discussing here in
only one other function, "db_migrate_refresh()" (~480 lines). But this
second one doesn't use the "move forward all" device to get a row count.
I looked only at these two functions.
I noted that neither of these functions has a refcursor formal argument
and that, rather, you open (i.e. create) each of the three cursors that
you use within the two functions that uses them. I noted, too, that for
the three "select" statements that you use to open your refcursors, none
of these has an "order by". I noted that your code implements "create
table destination" and "insert into destination... select
from source..." where order doesn't matter.
CREATE FUNCTION materialize_foreign_table(
schema name,
table_name name,
with_data boolean DEFAULT TRUE,
pgstage_schema name DEFAULT NAME 'pgsql_stage'
) RETURNS boolean
LANGUAGE plpgsql VOLATILE STRICT SET search_path = pg_catalog AS
$$DECLARE
ft name;
stmt text;
errmsg text;
detail text;
cur_partitions refcursor;
cur_subpartitions refcursor;
...
CREATE FUNCTION db_migrate_refresh(
plugin name,
staging_schema name DEFAULT NAME 'fdw_stage',
pgstage_schema name DEFAULT NAME 'pgsql_stage',
only_schemas name[] DEFAULT NULL
) RETURNS integer
LANGUAGE plpgsql VOLATILE CALLED ON NULL INPUT SET search_path =
pg_catalog AS
$$DECLARE
extschema text;
old_msglevel text;
v_plugin_schema text;
v_create_metadata_views regproc;
v_translate_datatype regproc;
v_translate_identifier regproc;
v_translate_expression regproc;
c_col refcursor
...
What is not formal about the above?
Though it does not matter as, back to the docs again:
https://www.postgresql.org/docs/current/plpgsql-cursors.html
"All access to cursors in PL/pgSQL goes through cursor variables, which
are always of the special data type refcursor. One way to create a
cursor variable is just to declare it as a variable of type refcursor.
Another way is to use the cursor declaration syntax, which in general is:
name [ [ NO ] SCROLL ] CURSOR [ ( arguments ) ] FOR query;
"
Though maybe you are trying to differentiate between bound and unbound
refcursor's, where the ones defined in the functions are unbound.
Also order by is not relevant for getting a count.
Finally, I see how an understanding of internals helps the understanding
of performance-related question. But I find it very hard to accept that
I should read the C implementation of PostgreSQL in order to get the
proper mental model, and the proper terms of art, that I need to
understand it semantics (and the corresponding SQL and PL/pgSQL syntax).
Having said that, I did a little test an saw that this:
move last in cur;
get diagnostics n = row_count;
reported just 1—in contrast to what you used:
move forward all in cur;
Because they are different actions.
The first directly moves to the last row in the cursor and counts just
that row
The second moves through all the rows left in the cursor and counts all
the rows it traversed.
It is spelled out here:
https://www.postgresql.org/docs/current/sql-move.html
"The count is the number of rows that a FETCH command with the same
parameters would have returned (possibly zero)."
I've no idea how it's possible to navigate to the last result in the set
without knowing how many there are. Maybe that fact is there
internally—but with no explicit SQL or PL/pgSQL exposure of the value.
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx