bryn@xxxxxxxxxxxx wrote: www.postgresql.org/message-id/20220618064453.wtz4hxyeptwdh37z%40hjp.at ended up in my junk folder. (This happens often, but randomly, with emails sent to pgsql-general—and I can't control it.) I read it quickly. And then when I returned to read it more carefully I managed to delete it—irrevocably. The discussion has wandered so far from the original topic that it seemed just as well to start a new thread with this.
It seems to me that my new tescase reveals one clear bug and one outcome that seems to me to be a bug. Others might argue that neither is a bug. Of course, I'll be happy to be corrected—especially if I did some typos in my SQL or misinterpreted what I saw. ———————————————————————————————————————————————————————————————————————————————— Thank you for that example. It stimulated me to think harder than I had before and to design a brand new testcase. I didn't need to refer to your example when I wrote what follows. I'm using the term "view" here as a shorthand for « the results from a "select" ». And I used real views in my testcase to make the programming easier. I copied my self-contained testcase below so that you can simply run it "as is". (You'll have to add the "drop" statements that you need.) Here's my interpretation of the testcase output: Self-evidently, a view does *not* inherit constraints from the columns of its base table(s). A view on a single table doesn't necessarily inherit the data types of its base table's columns. Rather, the view compilation's analysis is *sometimes* clever enough to notice when a projected column might have a NULL even when the base column doesn't allow NULLs. In this case, if the base column's data type is (in my example) the domain "text_nn", then the corresponding column in the view is given the data type plain "text". My test that uses a single table shows this. However, the compilation's analysis for a view on a join (at least when it's a two-table "outer join") slavishly inherits the data types from all of the referenced columns—even when the human can easily predict that some projected columns might have NULLs. It seems to me that this optimistic design choice was unfortunate—and that a pessimistic choice would have been better: — when the analysis cannot predict the outcome, replace the data type of *every* column that has a "not null domain" data type with the domain's base data type. But I accept that this cannot be changed now. Might it be possible (in some future PG release) to make the analysis clever enough to deal with the issue at hand (as it already does in my single-table example)? With "insert-select", you (self-evidently) define the data types and constraints of the target table's columns explicitly, reflecting your analysis of what you expect. Of course, then, the "insert-select" must respect the target table's definition—and there's no paradox. (But see "Finally" below for the "insert-select from self" case.) But with CTAS, the target table mechanically inherits the data types that the view's columns have been given. Moreover (and this hugely surprises me), the mechanics of CTAS do not respect the constraint that my "text_nn" brings, for some column(s), but somehow manage to "tunnel under" this but even so give the resulting table's corresponding column the "text_nn" data type. What is the reasoning that led to implementing this behavior? (Notice that any subsequent insert into the table (even using "insert-select") *does* respect the constrains that "not null domains" bring—except for the caveat that I expose in "Finally". This is the paradox that we've been discussing. I had expected errors where I now see the "tunneling under". I realize now that I had never tested this explicitly. It would be foolish, therefore, to define the target table for "insert-select" using "CTAS where false". B.t.w., and orthogonal to the point here, with CTAS you always have to alter the resulting table manually after the fact to add constraints. But with "insert-select", you can choose whether to define the constraints as part of "create table" or, using "alter table", after the "insert". If you want the famous advantages of CTAS over "insert-select", you can easily decorate the defining "select list" with appropriate typecasts. Finally, I show that a subsequent "insert" into the table that (automatically) has been given "text_nn" for some column *does* respect the constraint that this brings: (1) When it defines the to-be-inserted data explicitly with "values". (2) When the "insert-select" source is a different table from the target. However, the "text_nn" constraint is *not* respected here: (3) When the "insert-select" source is the source table itself. ———————————————————————————————————————————————————————————————————————————————— \pset null '~~' /* HERE'S WHAT I SEE AT THE psql PROMPT; Table "u1.st_target" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- k | integer | | | v1 | text_nn | | | v2 | text | | | Table "u1.oj_target" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- k | integer | | | v1 | text_nn | | | v2 | text_nn | | | psql:testcase.sql:108: ERROR: domain text_nn does not allow null values psql:testcase.sql:116: ERROR: domain text_nn does not allow null values k | v1 | v2 -----+--------+--------- 10 | book-1 | genre-1 20 | book-2 | genre-1 30 | book-3 | genre-1 40 | book-4 | genre-2 50 | book-5 | ~~ 110 | book-1 | genre-1 120 | book-2 | genre-1 130 | book-3 | genre-1 140 | book-4 | genre-2 150 | book-5 | ~~ */; -- Single table create domain text_nn as text not null; create table st_source_tab ( k int primary key, v1 text_nn, v2 text_nn); insert into st_source_tab(k, v1, v2) values (10, 'book-1', 'genre-1'), (20, 'book-2', 'genre-1'), (30, 'book-3', 'genre-1'), (40, 'book-4', 'genre-2'), (50, 'book-5', ''); create view st_source_view(k, v1, v2) as select k, v1, case when k = 50 then null else v2 end case from st_source_tab; create table st_target as select k, v1, v2 from st_source_view; -- \d st_source_view -- Same as "st_source_view", of course -- Notice that "v2" is plain "text". But its source column is "text_nn". \d st_target ---------------------------------------------------------------------- -- Outer join create table genres( pk int primary key, genre text_nn); insert into genres(pk, genre) values (1, 'genre-1'), (2, 'genre-2'), (3, 'genre-3'); create table books( pk int primary key, title text_nn, genre_pk int references genres(pk)); insert into books(pk, title, genre_pk) values (10, 'book-1', 1), (20, 'book-2', 1), (30, 'book-3', 1), (40, 'book-4', 2), (50, 'book-5', null); -- Make the column names use-case agnostic and match those of st_source_view. create view oj_source_view(k, v1, v2) as select b.pk, b.title, g.genre from books b left outer join genres g on b.genre_pk = g.pk; create table oj_target as select k, v1, v2 from oj_source_view; -- \d oj_source_view -- Same as "oj_source_view", of course -- Notice that "v2" is plain "text_nn", just as has been discussed. \d oj_target ---------------------------------------------------------------------- -- Show that the contents of tables "st_target" and "oj_target" -- are identical. Unsurprising. Just a sanity check. do $body$ declare differ constant boolean := ( with a as (select * from st_target except select * from oj_target), b as (select * from oj_target except select * from st_target) select (exists(select 1 from a) or exists(select 1 from b)) ); begin assert not differ, '"j_books" versus "r_books_j_view" test failed'; end; $body$; ---------------------------------------------------------------------- -- Finally: A BRAND NEW PARADOX. -- Case 1: "insert" using explicit "values()" clause. -- Fails with error 23502: -- domain text_nn does not allow null values insert into oj_target(k, v1, v2) values (99, 'book-99', null); -- Case 2: -- "insert-select" from non-self. -- Fails with error 23502: -- domain text_nn does not allow null values insert into oj_target select (k + 200), v1, v2 from st_target; -- Case 3: -- "insert-select" from self. -- SILENTLY SUCCEEDS! insert into oj_target select (k + 100), v1, v2 from oj_target; -- End result. select k, v1, v2 from oj_target order by k; |