On Mon, Apr 10, 2017 at 12:01 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Merlin Moncure <mmoncure@xxxxxxxxx> writes:
>> On Sun, Apr 9, 2017 at 4:27 PM, Guyren Howe <guyren@xxxxxxxxx> wrote:
>>> If I do a SELECT with a GROUP_BY and multiple ARRAY_AGGs, will the ARRAY_AGGs be guaranteed to have entries in the same (ie corresponding) order?
>>>
>>> eg
>>>
>>> SELECT
>>> u.name,
>>> ARRAY_AGG(o.order_date) AS order_dates,
>>> ARRAY_AGG(o.order_total) AS order_totals
>>> FROM
>>> user u JOIN
>>> orders o USING (user_id)
>>> GROUP BY
>>> u.user_id
>
>> It is unsafe to rely on aggregation order unless specified -- you can
>> add ORDER BY to the aggregation clause.
>
> You definitely can't assume anything about the order in which the FROM
> clause will deliver rows, but I think that's not quite what the question
> was. If I read it right, the OP wants to be sure that the two aggregate
> functions will see the data in the *same* unspecified order. I think
> that's a pretty safe assumption. The server would have to go way
> out of its way to do differently, and it doesn't.
Sure, but isn't it fair to consider that an implementation artifact?
If his code depends on that ordering being the same across aggregate
functions, and the SQL standard doesn't specify that (I guess it
might, but I'm skeptical), he ought to specify that for clarify at the
very least.
So, the presence of ORDER BY in the aggregate function call is a PostgreSQL extension...
It seems reasonable to declare that the order of the values in the generated array match whatever order the FROM clause supplies the rows. If that is not acceptable a PostgreSQL-specific ORDER BY modifier can be added which will cause an additional sort-and-scan of the input relation to occur (optimized across multiple column invocations when possible). Thus two aggregate functions w/o an ORDER BY will always see the source rows in the same order.
SELECT array_agg(v)
FROM (
SELECT * FROM (VALUES (3),(1),(2)) vals (v) ORDER BY 1
) AS src
The only real confusion is whether a query like the above is guaranteed to supply rows to the outer select target list in order. If that is indeed the case then the overall behavior seems quite reasonable to explicitly define like above.
Dave