On Mon, Oct 12, 2015 at 10:15 PM, Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:
On 10/12/2015 06:53 AM, Tom Lane wrote:
Andres Freund <andres@xxxxxxxxxxx> writes:
On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote:
CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
$$
BEGIN
RETURN QUERY SELECT * FROM users WHERE id = id_;
END;
$$ LANGUAGE plpgsql;
My guess is that the problem here is that table level locking prevents
modification of the "users" type when the table is used, but there's no
locking preventing the columns to be dropped while the function is
used. So what happens is that 1) the function is parsed & planned 2)
DROP COLUMN is executed 3) the contained statement is executed 4) a
mismatch between the contained statement and the function definition is
detected.
The query plan as such does get refreshed, I believe. The problem is that
plpgsql has no provision for the definition of a named composite type to
change after a function's been parsed. This applies to variables of named
composite types for sure, and based on this example I think it must apply
to the function result type as well, though I'm too lazy to go check the
code right now.
That makes sense. The problem is that I cannot square that with Albe's example, which I tested also:
"
Session 1:
test=> CREATE TABLE users (id integer PRIMARY KEY, name varchar NOT NULL, to_be_removed integer NOT NULL);
CREATE TABLE
test=> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
$$BEGIN RETURN QUERY SELECT * FROM users WHERE id = id_; END;$$ LANGUAGE plpgsql;
CREATE FUNCTION
Session 2:
test=> SELECT id, name FROM select_users(18);
id | name
----+------
(0 rows)
Ok, now the plan is cached.
Now in Session 1:
test=> ALTER TABLE users DROP COLUMN to_be_removed;
ALTER TABLE
Session2:
test=> SELECT id, name FROM select_users(18);
id | name
----+------
(0 rows)
No error. This is 9.4.4.
"
We have had past discussions about fixing this. I believe it would
require getting rid of use of plpgsql's "row" infrastructure for named
composites, at least in most cases, and going over to the "record"
infrastructure instead. In the past the conversations have stalled as
soon as somebody complained that that would probably make some operations
slower. I don't entirely understand that objection, since (a) some other
operations would probably get faster, and (b) performance does not trump
correctness. But that's where the discussion stands at the moment.
regards, tom lane
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx
In case any of you are interested of recreating this problem, I today had the time to create a short example that reproduce the error every time I try.
1. Create table and function
create table a(
id serial primary key,
x integer
);
create or replace function select_a() returns setof a AS
$$
begin
return query
select a.* from a;
end;
$$ language plpgsql;
id serial primary key,
x integer
);
create or replace function select_a() returns setof a AS
$$
begin
return query
select a.* from a;
end;
$$ language plpgsql;
2. Create loop_alter.sql with this content
#!/usr/bin/env bash
for i in {0..1000}; do
echo "alter table a add column y text; alter table a drop column y;"
done;
for i in {0..1000}; do
echo "alter table a add column y text; alter table a drop column y;"
done;
3. Create loop_select.sql with this content
#!/usr/bin/env bash
for i in {0..100000} do
echo "select * from select_a() limit 1;"
done;
#!/usr/bin/env bash
for i in {0..100000} do
echo "select * from select_a() limit 1;"
done;
4. Run the files from 2 and 3 simultaneous with psql:
In one terminal: ./loop_alter.sql | psql
In another: ./loop_select.sql | psql
(Note that you need to drop and recreate the table after each run since it reaches the column limit otherwise)
/Victor