Search Postgresql Archives

Re: Guidance on INSERT RETURNING order

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On Tue, Apr 11, 2023, at 4:22 PM, Tom Lane wrote:
Federico <cfederico87@xxxxxxxxx> writes:
> Searching the archive seems that a using the INSERT SELECT ORDER BY
> form should be a better solution,
> so the above insert should be rewritten as

>     INSERT INTO t(data)
>     SELECT data FROM (VALUES ('a', 1), ('b', 2), ('c', 3)) as vv(data,
> num) ORDER BY num
>     RETURNING id

> to ensure that the id are created in the order specified by num. The
> returned id can again be in
> arbitrary order, but sorting them should enable correctly matching the
> orm object so that they can
> be properly updated.
>     Is this correct?

No.  Sadly, adding that ORDER BY is just voodoo programming, because
it applies to the result of the SELECT while promising nothing about
the order in which INSERT/RETURNING will act on those rows.

Re-reading that 2012 thread, the main new observation I'd make today
is that parallel operation is a thing now, and it's not hard to foresee
that sometime soon we'll want to parallelize INSERTs.  Which'd make it
*really* hard to promise anything about the order of RETURNING output.

if I can state this without having RETURNING getting in the way, because we know RETURNING is not ordered.

Start with this table:

CREATE TABLE mytable (
   id SERIAL PRIMARY KEY,
   a INT,
   b INT
)

Then insert two rows, where id SERIAL fires implicitly, assume the next value the sequence will give us is N1, and then the value after that is N2.  It doesn't matter what N1 and N2 are (don't need to be consecutive) but we want N2 > N1, that is, increasing.

INSERT INTO mytable (a, b) SELECT p1, p2 FROM (VALUES (10, 11, 1), (12, 13, 2)) AS sel(p1, p2, num) ORDER BY num

Then SELECT with ORDER BY:

SELECT id, a, b FROM mytable ORDER BY id

We want the results to be:

(N1, 10, 11)
(N2, 12, 13)


and we dont want them to *ever* be:

(N1, 12, 13)
(N2, 10, 11)


that is, we want the SERIAL column (or an IDENTITY also) to be lined up with the VALUES.

From what you wrote in https://www.postgresql.org/message-id/29386.1528813619%40sss.pgh.pa.us , that seems to be exactly what you've stated, where this statement:

INSERT INTO table (a, b) SELECT p1, p2 FROM (VALUES (10, 11, 1), (12, 13, 2)) AS sel(p1, p2, num) ORDER BY num

is organized by the query planner to essentially be equivalent to this, where the nextval() is part of the SELECTed data:

INSERT INTO mytable (id, a, b) SELECT nextval('mytable_id_seq'), p1, p2 FROM
(SELECT p1, p2 FROM (VALUES (10, 11, 1), (12, 13, 2)) AS sel(p1, p2, num) ORDER BY num) as _x

in practice, we add "RETURNING id" and expect those "id" cols to be in increasing order, so we sort to match it up with the input rows.





I think if you want to use RETURNING with multi-row inserts, the
thing to do is more like

  INSERT INTO t(data) VALUES ('a'), ('b'), ('c') RETURNING data, id

and then explicitly match up the returned "data" values rather than
presuming they appear in the same order you wrote them in in VALUES.

we're going to do that also when the table has something like a uuid for a primary key or otherwise.


Admittedly this might be problematic if some of the VALUES rows
are identical, but how much should you care?

we only do any of this if the rows have something unique in them we can hook onto.



regards, tom lane


thanks so much for replying!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux