Re: Are multiple array_aggs going to be in the same order?

Paul Jungwirth <pj@xxxxxxxxxxxxxxxxxxxxxxxx> · Mon, 10 Apr 2017 10:10:11 -0700

On 04/10/2017 09:33 AM, Merlin Moncure wrote:
On Sun, Apr 9, 2017 at 4:27 PM, Guyren Howe <guyren@xxxxxxxxx> wrote:
If I do a SELECT with a GROUP_BY and multiple ARRAY_AGGs,
>> will the ARRAY_AGGs be guaranteed to have entries in the
>> same (ie corresponding) order?
It is unsafe to rely on aggregation order unless specified --
> you can add ORDER BY to the aggregation clause.

Just thought I'd add that if you do this:

    SELECT  u.name,
            ARRAY_AGG(o.order_date ORDER BY o.id) AS order_dates,
            ARRAY_AGG(o.order_total ORDER BY o.id) AS order_totals
    ...

Then you can check EXPLAIN and should see that Postgres is only doing 
one sort, not two, so there is no performance cost. (Of course for more 
complicated queries you might want to double check what EXPLAIN intends 
to do.)

But something I've wondered myself is how well Postgres handles sorting 
already-mostly-sorted lists? I tried diving into the code, starting from 
here:

https://doxygen.postgresql.org/tuplesort_8c_source.html

but I couldn't answer the question myself. The comments say that as long 
as the tuples all fit in work_mem, it uses qsort, but the code appears 
to call qsort_ssup or qsort_tuple, whose definitions I couldn't find. (I 
looks like they are from qsort_tuple.c which is generated by a Perl 
script.) I know that qsort(3) is not necessarily quicksort, despite the 
name. Does anyone know what algorithm Postgres uses? It seems like the 
mostly-already-sorted case would happen a lot, so I'm curious if 
Postgres pays a performance cost there?

Thanks,
Paul

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general