Search Postgresql Archives

Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/20/23 1:46 PM, Bryn Llewellyn wrote:
adrian.klaver@xxxxxxxxxxx wrote:

bryn@xxxxxxxxxxxx wrote:

adrian.klaver@xxxxxxxxxxx wrote:

I have a hard time fathoming why someone who writes documentation does not actually read documentation.


Ouch. In fact, I had read the whole of the "43.7. Cursors" section in the "PL/pgSQL" chapter. And the sections in the "SQL Commands" chapter for "declare", "fetch" and "close". But several of the key concepts didn't sink in and this prevented me not only from understanding what some of the examples showed but, worse, from being able to use the right vocabulary to express what confused me.

Given this from your original question:

« Anyway, without anything like Oracle PL/SQL's packages, you have no mechanism to hold the opened cursor variable between successive server calls. »

What part of this [a particular code example] did not make sense in that context?

First off, despite the fact that I've clearly annoyed you (for which I apologize), I have found these exchanges very helpful. So thank you very much.

Your questions can be summarized as "Why couldn't you understand the doc? And why did you mix questions about use-cases with questions about the mechanics?" The answer has to do with psychology. I probably can't explain this convincingly. That's why it's taken me a long time to respond. I also had to do lots of testing before responding to make sure that the mental model that I've formed for myself is consistent with these.

You may not be interested in what follows. But, anyway, here goes.

— I happen to have many years of experience with Oracle Database and PL/SQL. The latter has analogous features to PL/pgSQL's "refcursor". But the differences between the notions in the two environments are enormous. My attempt to understand the latter was hindered by my understanding of the former. I accept that this is *my* problem and that I could never expect that the PG doc would cater for such a reader.

— Oracle Database doesn't expose scrollability for PL/SQL's equivalent of "refcursor". So I had never come across use cases where this was beneficial. I wanted, therefore, to hear about some. I thought that insights here would help me understand the mechanics. But I didn't get anything beyond "Scrollability is what it is. If you don't need it, don't use it."

Anyway, never mind all that now. Here's what I now (think that) I understand—with some comments on what made it hard for me to grasp.

— The key notion is what is referred to sometimes as "portal" and sometimes as "cursor". This is the thing that's global within, and private to, a session, that's uniquely identified by a bare name, that, and that's listed in "pg_cursors". I believe that in typical use, a cursor has only transaction duration. But (and only when you use the SQL API) you can create a cursor with (up to) session duration

— The doc pages for the "declare", "fetch", and "close" SQL statements don't mention "portal" and use only "cursor". They use the term to mean the underlying phenomenon and use wording like: "DECLARE allows a user to create cursors"; "You can see all available cursors by querying the pg_cursors system view"; "FETCH retrieves rows using a previously-created cursor"; "CLOSE frees the resources associated with an open cursor. After the cursor is closed, no subsequent operations are allowed on it. A cursor should be closed when it is no longer needed." However, these pages leave the term "open" undefined, though it's used. It seems that it has no meaning. Rather, a cursor with a particular name either exists or not. You create it with "declare" and drop it with "close". And that's it. If "open" means anything, it's just another word for "exists". (The fact that "pg_cursors" doesn't have a boolean column called "open" supports this understanding.) The sentence "After the cursor is closed, no subsequent operations are allowed on it." is equivalent to "After a table is dropped, no subsequent operations are allowed on it." But who would bother to say that? Notice that "pg_cursors" has a column called "creation_time" — and not "declaration time".

— On the other hand, the doc page "43.7. Cursors" uses "portal" a lot—and never says that it means exactly the same as "cursor" qua term of art (and not qua keyword). It does say "...a so-called portal containing the active query for the cursor." This suggests a subtle difference in meaning between "portal" and "cursor" and a notion of containment. I can't make any sense of that. It says things like "Before a cursor can be used to retrieve rows, it must be opened. (This is the equivalent action to the SQL command DECLARE CURSOR.)" This is the closest that it comes to saying that the SQL API and the PL/pgSQL API both manipulate the same thing—what you see in "pg_cursors". The sentence that I quoted is equivalent to saying "Before you can insert a row into a table, the table has to exist." In other words, an unhelpful tautology. I believe that the sense is this: « A variable with the data type "refcursor" holds a bare name (which is governed by the usual rules for a SQL name). The name might be found in "pg_cursors" or it might not be. When, and only when, it is found in "pg_cursors", the refcursor variable acts as a handle to the denoted cursor and supports operations upon it using various PL/pgSQL statements that use the identifier for the refcursor variable's name.

— I (but maybe only I) would have appreciated being able to read a single generic account that explained the underlying concepts. This would have prepared me for understanding the operations that the SQL and PL/pgSQL APIs expose—and especially that they are interoperable. So I'd've liked to see a note at the start of the four relevant sections ("43.7. Cursors" and the "declare", "fetch", and "close" SQL statements) that x-ref'd to the generic account and said "read this first".

Here's some more detail of how I'd state the mental model that I've deduced. Please tell me if you think that some, or all, of my account is wrong. (When I say "cursor", I always mean what's listed in "pg_cursors". And I'll never mention "portal" because the term seems to means exactly the same as "cursor".)

(1) A cursor must have a defining "select" statement. It also always has a pointer to the next-to-be-fetched row in the result set that the "select" defines. The rows are (implicitly) numbered from 1 through N where N is the number of rows that the cursor's "select" defines. However (as you can see from "fetch :x" in SQL, where :x is less than 1 or more than N) the pointer can point outside of the result set and not cause an error.

(2) A cursor defines a read-consistent snapshot, as of its "pg_cursors.creation_time". (From the "declare" doc, « In PostgreSQL, all cursors are insensitive. ») The complete set of rows that the "select" defines may not all be concurrently materialized in the cursor. This implies some kind of aging out and replacement implementation. The details aren't described because they have no semantic significance.

(3) In top-level SQL, you create a cursor with the "declare" statement. This lets you name it, specify its “select”, and specify a few other boolean attributes like "[ no ] scroll" and "{ with | without } hold".

(4) In PL/pgSQL, you create a cursor with "open". The operand is the identifier for the refcursor variable that holds the cursor's name. You can test your mental model by using the equivalent SQL statements with the "execute" PL/SQL statement.

(5) In top-level SQL, you drop a cursor with "close" where the operand is the identifier for the cursor's name. In PL/pgSQL, you drop a cursor with "close" where the operand is the identifier for the refcursor variable that holds the cursor's name.

(6) In PL/pgSQL, the value of a refcursor variable is an ordinary "text" value. It might be null. It you assign the name of a cursor that's listed in "pg_cursors" to a refcursor variable, then you can fetch from it or close it. And as long as the name isn't currently found in "pg_cursors", you can create a new row with that name with the "open" statement, specifying any "select" that you want.

(7) I found the terms "bound cursor" and "unbound cursor" (as in the section  "43.7.2.3. Opening A Bound Cursor") initially very confusing because the wording connotes a property of a cursor—and "pg_cursors" has no column for such a notion. But I presently came to understand that this was a careless shorthand for "[un]bound cursor variable" — which phrases are also used on the same page.

(8) I found it initially hard to understand that the "bound" property of a refcursor variable is not part of its value. (And nor, for that matter, is the SQL statement that you specify with the "open" statement.) I reasoned, eventually, that the "bound" property must be an annotation of the variable in the AST for the block statement where the variable is declared. (Here, "declare" is used in the PL/pgSQL sense, and not the SQL sense, of the term). This explains why, when a function returns a refcursor value where the variable was declared as "bound", it can only be seen as "unbound" in a subprogram that has a refcursor formal argument. The same reasoning applies if you assign a bound refcursor variable to an unbound refcursor variable. (But I can't see that you'd have a reason to do that unless, like I was, you were testing your mental model.)

It's the fact that the value that a refcursor variable holds is nothing other than the text of a (potential) cursor's name (and that the SQL text and "bound" status are represented elsewhere) that lead me to write « without anything like Oracle PL/SQL's packages, you have no mechanism to hold the opened cursor variable between successive server calls ». When I wrote that, I thought, wrongly as I now see, that a refcursor variable held a composite, opaque value (or an opaque pointer to such) like it does in Oracle.

(9) The upshot of #8 is that the "FOR recordvar IN bound_cursorvar" construct can be used only in the block statement that declares the bound cursor variable. And this seems to defeat the point. You may just as well use an ordinary "for" loop that has the SQL statement right after the "in" keyword.

(10) I discovered that this construct:

for ... in select ... from pg_cursors order by name loop
   ...
end loop;

sees a cursor with an automatically generated name like "<unnamed portal N>" for the loop itself. I suppose that this makes good sense. But it does seem to undermine the value of declaring and using a bound cursor variable—esp as the nominal value of the "cursor" concept is the scrollability and the ability to fetch a smallish set of rows from anywhere in a huge result set.


Pretty much all of the above can be explained by:

https://www.postgresql.org/docs/current/sql-declare.html

"Note

This page describes usage of cursors at the SQL command level. If you are trying to use cursors inside a PL/pgSQL function, the rules are different —"

"The SQL standard only makes provisions for cursors in embedded SQL. The PostgreSQL server does not implement an OPEN statement for cursors; a cursor is considered to be open when it is declared. However, ECPG, the embedded SQL preprocessor for PostgreSQL, supports the standard SQL cursor conventions, including those involving DECLARE and OPEN statements."

https://www.postgresql.org/docs/current/plpgsql-cursors.html

"Rather than executing a whole query at once, it is possible to set up a cursor that encapsulates the query, and then read the query result a few rows at a time. One reason for doing this is to avoid memory overrun when the result contains a large number of rows. (However, PL/pgSQL users do not normally need to worry about that, since FOR loops automatically use a cursor internally to avoid memory problems.) A more interesting usage is to return a reference to a cursor that a function has created, allowing the caller to read the rows. This provides an efficient way to return large row sets from functions."


As to portal, entering it in the documentation search leads to a first result of:

https://www.postgresql.org/docs/current/protocol-flow.html

Do a page search for portal.




--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux