Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL

Bryn Llewellyn <bryn@xxxxxxxxxxxx> · Thu, 23 Dec 2021 13:05:00 -0800

pavel.stehule@xxxxxxxxx wrote:

bryn@xxxxxxxxxxxx wrote:

Yes, I did read your “schema variables” post on your site “Pavel Stehule’s blog — Some notes about PostgreSQL”. It gives me a very good idea about what you have in mind.

But as I’ve come to understand the term “Functional Spec”, this denotes a formal deliverable that a product development (sub)group owns and maintains collectively. Especially, it has a status which, eventually, is set to “Approved”. And apart from minor bug fixes, no code makes its way into a released version of the (sub)product in question unless the group has approved its Functional Spec.

Is this model not used for the PostgreSQL system?

Development model in Postgres is not too formal, the specification can be changed until the last moment, the main word has commetter with an agreement of the author of patch and all other people. The development of Postgres is much more agile than waterfall.

bryn continued:

Your blogpost and all the comments are dated Feb 2018—so almost four years ago now. What needs to happen for your project to be exposed in a PG Release? And which release might this be?

Technically, this patch is not too complex, but a) it was harder to find cleaner with good performance implementation in architecture (the current architecture knows queries and utility commands, and LET is something between). b) it was harder to find an agreement about specification because global temporal objects like global temporary tables or session variables are not in Postgres today (there is not long experience with this feature). Unfortunately 1. the related part of standard SQL/PSM is not widely accepted and this part of the standard is almost dead , 2. the introduced modules are +/- Postgres's schema, so it is a really redundant concept, 3. SET command (in SQL/PSM (ANSI/SQL) is used for different purpose in Postgres, and I had to use keyword LET (there is not possibility to use keyword SET without compatibility break), c) there was long discussion if variables should be transactional or non transactional (I strongly support not transactional by default - like any other databases does, and transactional behavior will be optional in next step).  

I invite any help with code and documentation review (and support in discussion) - I am not native speaker, and my English is very poor. I hope there is a change to commit this patch in Postgres 15. But it depends on commiter's capacite - and there are a lot of patches in the queue.

You can watch the progress of this work on pgsql-hackers mailing list

older https://www.postgresql.org/message-id/flat/CAFj8pRDY+m9OOxfO10R7J0PAkCCauM-TweaTrdsrsLGMb1VbEQ@xxxxxxxxxxxxxx

current https://www.postgresql.org/message-id/flat/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=K50LUAOwjC3jm8Me9Obg@xxxxxxxxxxxxxx

bryn continued: Back to the substance of your proposal, 

1. I see that a schema variable will be another kind of object in the class that has tables, views, sequences, indexes, UDTs, procedures, functions, and so on. So it’s persisted within a database; it’s owned by a user; and it’s localized in a schema. Does it have its own namespace too? For example, can a table x, a function x(), and a schema variable x all live in the same schema s in some database. And can I use the normal qualified name syntax to access a schema variable?

variables have their own namespace, because they have their own catalog table. It is designed like any catalog object - so you can use (or you don't need) to use a qualified identifier. But there is not any schema scope in Postgres now. Anything depends on SEARCH_PATH setting. So variables can be in the same schema with tables and functions (the access rules are the same).

bryn continued: 2. It seems, then, that all the familiar notions, and especially the ability for a non-owner to access it apply. I see that the SELECT privilege governs who can read the value of a schema variable. But there seems to be no mechanism that governs who can change the value of a schema variable. It looks like only the owner can change the value—using the new LET SQL statement. Does this mean both that a top-level call from a client session that’s authorized as the owner can change it and a “security definer” subprogram with the same owner can change it—but that there is no explicit (say, WRITE) privilege for this. Is this what you intend? If so, why not have a WRITE privilege?

In the last patch I renamed schema variables to session variables (on the community request). You can grant READ or WRITE rights to other users by command GRANT, or you can write a security owner function, if you need more precious work with sensitive data.

bryn continued: 4. You said “security definer function”. Is this what you mean? Or do you mean ANY function or procedure as long as the current user (i.e. the owner of the most tightly enclosing security definer unit) is the schema variable’s owner?

inside the security owner function you are running under functions's owner identity. It can have the same identity as the variable's owner. It is common pattern in Postgres (not just in Postgres)

bryn continued: 5. Could you please consider allowing a CONSTANT schema variable (with the usual syntax and requirement for an initialization _expression_)? One very popular use of a spec-level package variable is for a universal constant that has to be accessed in several places—like, for example, the conversion factor between some metric unit and some imperial unit.

There is it - it is not named CONSTANT, but IMMUTABLE (because CONSTANT needs to introduction of new SQL keyword, and IMMUTABLE is already used keyword (new keywords can introduce some compatibility issues))

CREATE IMMUTABLE VARIABLE iv AS int DEFAULT 100;
-- should to fail
LET iv = 10000;
ERROR:  session variable "iv" is declared IMMUTABLE

bryn continued: 3. What is the precedence scheme? For example, if a SQL statement in a PL/pgSQL unit has a restriction like this:

   …where col = x…

and x is both the name of an in-scope variable (or formal parameter) in the unit and the name of a schema variable? When the table has a column called x, then there’s (at least sometimes) no way round a run-time collision error except to rename one of the X’s. (Qualifying the names doesn’t fix it.) Will it be the same with schema variables? The question extends to ordinary assignment statements that become SQL statements under the covers:

  v := a + b;

where b happens to be both an in-scope variable and a schema variable.

You can handle collisions by using qualified identifiers. For PL/pgSQL you can use block labels, for other objects schema.

From PostgreSQL’s SQL perspective the session variables are common database objects (contra PL/SQL where package variables are PL/SQL language objects), and SQL disallows ambiguity. This is a little bit more complex problem, because session variables can be used everywhere in Postgres (not just in PL/pgSQL).

PL/pgSQL doesn’t see session variables like something special - PL/pgSQL runtime doesn’t see session variables ever (the work with session variables are done one level deeper) (the usage of session variables are fully transparent for this environment) so there are the precedence rules without change, but you can use with session variables inside PL/pgSQL (but from implementation perspective it is just any other SQL object).

https://www.postgresql.org/docs/9.6/plpgsql-implementation.html#PLPGSQL-VAR-SUBST

and you can set different precedence 

#variable_conflict error
#variable_conflict use_variable
#variable_conflict use_column

but I strongly don't advise it. From my perspective, allowing collisions was a significant Oracle's PL/SQL design error.

Thank you very much indeed for this careful reply, Pavel—and for the links to the threads on the Hackers list. A great deal is now clarified for me. You said “I am not native speaker, and my English is very poor”. You’re far, far, too modest. I am a native English speaker. And I often see writing from other native English speakers that’s pretty awful. Your writing is a lot better than theirs. Of course, I don’t know a single word of Czech.

I’m glad to see that the non-transactional behavior is still part of the plan—even if, later, a transactional variant is added. I’m glad to see, too, that READ/WRITE privileges and IMMUTABLE are now part of the plan.

Just a detail, now. You sad this:

variables have their own namespace, because they have their own catalog table. It is designed like any catalog object - so you can use (or you don't need) to use a qualified identifier. But there is not any schema scope 

I probably misused the phrase “schema scope”. I meant only to say that, if two schemas, s1 and s2, exist in the same database, then it’s OK to have two session variables called s1.x and s2.x. And you confirmed that this is the case. It seems that you can also have a session variable called s1.x and a table called s1.x. In a separate reply to my email, david.g.johnston@xxxxxxxxx thought that this would be disallowed because session variables would share the same relations namespace that tables and views already do. (I have no opinion here.) Thanks for the clarification.

All is clear about collisions and name qualification now. And, yes: I agree with your advice to leave those settings that you mentioned at their shipped defaults and to go with the native behavior (collisions simply cause a run-time error). Then to fix such errors by ordinary spelling changes (esp. schema qualification) in one’s code.

You said “I invite any help with code and documentation review”. I’m not a C coder. But I’d be happy to help with documentation review when the time comes.

Finally, I’m well advanced with my promised mini-project to prepare and explain a realistic use-case to show the benefits of a package construct in PL/pgSQL. I’ll wait until after Xmas before I share it with the General list. Having said this, I already appreciate that the chances of bringing packages to some future PG release are vanishingly small.