Outer joins and NULLs (old subject "ERROR: failed to find conversion function from key_vals_nn to record[]")

Bryn Llewellyn <bryn@xxxxxxxxxxxx> · Sun, 19 Jun 2022 14:31:11 -0700

bryn@xxxxxxxxxxxx wrote:

david.g.johnston@xxxxxxxxx wrote:

bryn@xxxxxxxxxxxx wrote:

Can anybody show me an implementation of a realistic use case that follows proper practice — like "every table must a primary key", "a foreign key must refer to a primary key", and "joins may be made only "on" columns one of which has a PK constraint and the other of which has a FK constraint" — where using a not nullable data type brings a problem that wouldn't occur if the column were defined with a nullable data type and an explicit "not null" constraint?

Nothing obvious comes to mind. But frankly:
proper practice includes trying to write idiomatic code for the language you are using so others familiar with the language can learn your code more easily. You are violating this to an extreme degree.

I do not think it to be a good trade-off. SQL writers are practical people and the idioms largely avoid any downsides that the arise from SQL not being some paragon of language design.

-- "\d genres" shows "gk" with a "not null" constraint, whether I write it
-- or not. And convention seems to say "don't clutter you code by writing it".
create table genres(
  gk  int   primary key,
  gv  text  not null
  );

"Primary Key" is defined to be the application of both UNIQUE and NOT NULL constraints...

Yes, I know what "primary key" implies. I meant only to emphasize that the source column for what the "outer join" projects has a not null constraint, that it doesn't apply to the projection of that column, that this is perfectly understandable, and that this isn't a problem. Never mind.

Peter, your email:

www.postgresql.org/message-id/20220618064453.wtz4hxyeptwdh37z%40hjp.at

ended up in my junk folder. (This happens often, but randomly, with emails sent to pgsql-general—and I can't control it.) I read it quickly. And then when I returned to read it more carefully I managed to delete it—irrevocably.

The discussion has wandered so far from the original topic that it seemed just as well to start a new thread with this.

hjp-pgsql@xxxxxx wrote:

But it would be a problem if there was an actual type which wouldn't include NULL.

The NOT NULL attribute is an attribute of the column, not the type. When you use the primary key (or any other column marked as NOT NULL) the type of the result is just the type of that column, the NOT NULL is dropped.

It seems to me that my new tescase reveals one clear bug and one outcome that seems to me to be a bug. Others might argue that neither is a bug. Of course, I'll be happy to be corrected—especially if I did some typos in my SQL or misinterpreted what I saw.

————————————————————————————————————————————————————————————————————————————————

Thank you for that example. It stimulated me to think harder than I had before and to design a brand new testcase. I didn't need to refer to your example when I wrote what follows.

I'm using the term "view" here as a shorthand for « the results from a "select" ». And I used real views in my testcase to make the programming easier.

I copied my self-contained testcase below so that you can simply run it "as is". (You'll have to add the "drop" statements that you need.)

Here's my interpretation of the testcase output:

Self-evidently, a view does *not* inherit constraints from the columns of its base table(s).

A view on a single table doesn't necessarily inherit the data types of its base table's columns. Rather, the view compilation's analysis is *sometimes* clever enough to notice when a projected column might have a NULL even when the base column doesn't allow NULLs. In this case, if the base column's data type is (in my example) the domain "text_nn", then the corresponding column in the view is given the data type plain "text". My test that uses a single table shows this.

However, the compilation's analysis for a view on a join (at least when it's a two-table "outer join") slavishly inherits the data types from all of the referenced columns—even when the human can easily predict that some projected columns might have NULLs.

It seems to me that this optimistic design choice was unfortunate—and that a pessimistic choice would have been better:

— when the analysis cannot predict the outcome, replace the data type of *every* column that has a "not null domain" data type with the domain's base data type.

But I accept that this cannot be changed now. Might it be possible (in some future PG release) to make the analysis clever enough to deal with the issue at hand (as it already does in my single-table example)?

With "insert-select", you (self-evidently) define the data types and constraints of the target table's columns explicitly, reflecting your analysis of what you expect. Of course, then, the "insert-select" must respect the target table's definition—and there's no paradox. (But see "Finally" below for the "insert-select from self" case.)

But with CTAS, the target table mechanically inherits the data types that the view's columns have been given. Moreover (and this hugely surprises me), the mechanics of CTAS do not respect the constraint that my "text_nn" brings, for some column(s), but somehow manage to "tunnel under" this but even so give the resulting table's corresponding column the "text_nn" data type.

What is the reasoning that led to implementing this behavior? (Notice that any subsequent insert into the table (even using "insert-select") *does* respect the constrains that "not null domains" bring—except for the caveat that I expose in "Finally".

This is the paradox that we've been discussing.

I had expected errors where I now see the "tunneling under". I realize now that I had never tested this explicitly.

It would be foolish, therefore, to define the target table for "insert-select" using "CTAS where false".

B.t.w., and orthogonal to the point here, with CTAS you always have to alter the resulting table manually after the fact to add constraints. But with "insert-select", you can choose whether to define the constraints as part of "create table" or, using "alter table", after the "insert".

If you want the famous advantages of CTAS over "insert-select", you can easily decorate the defining "select list" with appropriate typecasts.

Finally, 

I show that a subsequent "insert" into the table that (automatically) has been given "text_nn" for some column *does* respect the constraint that this brings:

(1) When it defines the to-be-inserted data explicitly with "values".

(2) When the "insert-select" source is a different table from the target.

However, the "text_nn" constraint is *not* respected here:

(3) When the "insert-select" source is the source table itself.

————————————————————————————————————————————————————————————————————————————————

\pset null '~~'

/*
  HERE'S WHAT I SEE AT THE psql PROMPT;

                 Table "u1.st_target"
   Column |  Type   | Collation | Nullable | Default 
  --------+---------+-----------+----------+---------
   k      | integer |           |          | 
   v1     | text_nn |           |          | 
   v2     | text    |           |          | 

                 Table "u1.oj_target"
   Column |  Type   | Collation | Nullable | Default 
  --------+---------+-----------+----------+---------
   k      | integer |           |          | 
   v1     | text_nn |           |          | 
   v2     | text_nn |           |          | 

  psql:testcase.sql:108: ERROR:  domain text_nn does not allow null values
  psql:testcase.sql:116: ERROR:  domain text_nn does not allow null values
    k  |   v1   |   v2    
  -----+--------+---------
    10 | book-1 | genre-1
    20 | book-2 | genre-1
    30 | book-3 | genre-1
    40 | book-4 | genre-2
    50 | book-5 | ~~
   110 | book-1 | genre-1
   120 | book-2 | genre-1
   130 | book-3 | genre-1
   140 | book-4 | genre-2
   150 | book-5 | ~~
*/;

-- Single table
create domain text_nn as text not null;

create table st_source_tab (
  k   int primary key,
  v1  text_nn,
  v2  text_nn);

insert into st_source_tab(k, v1, v2) values
 (10, 'book-1', 'genre-1'),
 (20, 'book-2', 'genre-1'),
 (30, 'book-3', 'genre-1'),
 (40, 'book-4', 'genre-2'),
 (50, 'book-5', '');

create view st_source_view(k, v1, v2) as
select
  k,
  v1,
  case
    when k = 50 then null
    else        v2
  end case
from st_source_tab;

create table st_target as
select k, v1, v2
from st_source_view;

-- \d st_source_view
-- Same as "st_source_view", of course
-- Notice that "v2" is plain "text". But its source column is "text_nn".
\d st_target

----------------------------------------------------------------------
-- Outer join

create table genres(
  pk     int primary key,
  genre  text_nn);

insert into genres(pk, genre) values
  (1, 'genre-1'),
  (2, 'genre-2'),
  (3, 'genre-3');

create table books(
  pk        int primary key,
  title     text_nn,
  genre_pk  int references genres(pk));

insert into books(pk, title, genre_pk) values
  (10, 'book-1', 1),
  (20, 'book-2', 1),
  (30, 'book-3', 1),
  (40, 'book-4', 2),
  (50, 'book-5', null);

-- Make the column names use-case agnostic and match those of st_source_view.
create view oj_source_view(k, v1, v2) as
select
  b.pk,
  b.title,
  g.genre
from
  books b
  left outer join
  genres g
  on b.genre_pk = g.pk;

create table oj_target as
select k, v1, v2
from oj_source_view;

-- \d oj_source_view
-- Same as "oj_source_view", of course
-- Notice that "v2" is plain "text_nn", just as has been discussed.
\d oj_target

----------------------------------------------------------------------
-- Show that the contents of tables "st_target" and "oj_target"
-- are identical. Unsurprising. Just a sanity check.
do $body$
declare
  differ constant boolean :=
    (
    with
      a as (select * from st_target except select * from oj_target),
      b as (select * from oj_target except select * from st_target)
    select (exists(select 1 from a) or exists(select 1 from b))
    );
begin
  assert not differ, '"j_books" versus "r_books_j_view" test failed';
end;
$body$;

----------------------------------------------------------------------
-- Finally: A BRAND NEW PARADOX.

-- Case 1: "insert" using explicit "values()" clause.
-- Fails with error 23502:
-- domain text_nn does not allow null values
insert into oj_target(k, v1, v2) values
 (99, 'book-99', null);

-- Case 2:
-- "insert-select" from non-self.
-- Fails with error 23502:
-- domain text_nn does not allow null values
insert into oj_target
select (k + 200), v1, v2
from st_target;

-- Case 3:
-- "insert-select" from self.
-- SILENTLY SUCCEEDS!
insert into oj_target
select (k + 100), v1, v2
from oj_target;

-- End result.
select k, v1, v2 from oj_target order by k;