Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?

Bryn Llewellyn <bryn@xxxxxxxxxxxx> · Tue, 14 Mar 2023 20:29:35 -0700

adrian.klaver@xxxxxxxxxxx wrote:

bryn@xxxxxxxxxxxx wrote:
Section "43.7. Cursors” in the PL/pgSQL chapter of the doc (www.postgresql.org/docs/current/plpgsql-cursors.html#PLPGSQL-CURSOR-DECLARATIONS) starts with this:
«
Rather than executing a whole query at once, it is possible to set up a cursor that encapsulates the query, and then read the query result a few rows at a time. One reason for doing this is to avoid memory overrun when the result contains a large number of rows. (However, PL/pgSQL users do not normally need to worry about that, since FOR loops automatically use a cursor internally to avoid memory problems.) A more interesting usage is to return a reference to a cursor that a function has created, allowing the caller to read the rows. This provides an efficient way to return large row sets from functions.
»
On its face, it seems to make sense. And I’ve written a few proof-of-concept tests. For example, I wrote a “security definer” function that's owned by a role that can select from the relevant table(s) that returns refcursor. And I called it from a subprogram that's owned by a role that cannot select from the relevant table(s) to loop through the rows. But I can't convince myself that this division of labor is useful. And especially I can't convince myself that the "pipeling" capability is relevant in a three-tier app with a stateless browser UI. Here, the paradigm has the client-side app checking out a connection from the pool, generating the entire response to the end-user's request, releasing the connection, and sending the response back to the browser. This paradigm isn't consistent with allowing the end user to navigate forwards and backwards in a scrollable cursor that is somehow held in its open state in in the sever by the the middle tier client on behalf of a browser session that comes back time and again to its dedicated middle tier client and thence yo its dedicated database server session. (Anyway, without anything like Oracle PL/SQL's packages, you have no mechanism to hold the opened cursor variable between successive server calls.)

I guess that would depend on how you define a server call:

www.postgresql.org/docs/current/plpgsql-cursors.html#PLPGSQL-CURSOR-USING

"Once a cursor has been opened, it can be manipulated with the statements described here.

These manipulations need not occur in the same function that opened the cursor to begin with. You can return a refcursor value out of a function and let the caller operate on the cursor. (Internally, a refcursor value is simply the string name of a so-called portal containing the active query for the cursor. This name can be passed around, assigned to other refcursor variables, and so on, without disturbing the portal.)

All portals are implicitly closed at transaction end. Therefore a refcursor value is usable to reference an open cursor only until the end of the transaction."

Is it fair to say that the PL/pgSQL refcursor is useful, at best, only in very special use-cases?

Ah… I see. I had read this wrongly:

« Internally, a refcursor value is simply the string name of a so-called portal containing the active query for the cursor. This name can be passed around, assigned to other refcursor variables, and so on, without disturbing the portal. »

I thought that it was an "under the hood" explanation and that the only thing that you could see after entry into the PL/pgSQL subprogram that will be the top of the stack would be an opaque value (sometimes called a handle in other contexts) that was accessible only from PL/pgSQL.

I hadn't yet tried this in psql:

create schema s;
create table s.t(k int primary key, v text not null);
insert into s.t(k, v) values (1, 'cat'), (2, 'dog');

create function s.f(k_in in int)
  returns refcursor
  language plpgsql
as $body$
declare
  s_f_cur cursor(k_in int) for select v from s.t where k = k_in;
begin
  open s_f_cur(k_in);
  return s_f_cur;
end;
$body$;

create function s.g(cur in refcursor)
  returns text
  language plpgsql
as $body$
declare
  v text;
begin
  fetch cur into v;
  return v;
end;
$body$;

begin;
select ''''||s.f(1)||'''' as cur
\gset
select s.g(:cur) as result;
end;

I just did. And the result of "select s.g(:cur)" was the expected "cat".

It requires a leap of imagination, or help from the pgsql-general list, to get to this. So thanks!

Might the doc add an example like this?

Of course, it all falls into place now. I can see how I could write a client app in, say, Python to write a humongous report to a file by fetching manageably-sized chunks, time and again until done with a function like my "g()" here, from a cursor that I'd opened using a function like my "f()".

B.t.w., when I said "top-level call", I meant the SQL statement that a client issues—in this case most likely "select my_plpgsql_function()" or "call my_plpgsql_procedure()". That top-of-stack subprogram can invoke other subprograms and so on ad infinitum. But eventually the whole stack empties and control passes back to the client. But all that falls away now with the exampe I showed in place.