Re: is there a way to deliver an array over column from a query window?

Misa Simic <misa.simic@xxxxxxxxx> · Thu, 25 Apr 2013 15:21:59 +0200

SELECT DISTINCT a, b, c, array_agg(k.d) OVER (PARTITION BY k.c )  FROM testy k where k.e <> 'email' and k.c='1035049'  ORDER BY a, b, c, e

If doesnt work - Probably there is a better option...

In worst case I would do

SELECT DISTINCT a, b, c, array_agg(d)  OVER (PARTITION BY c )  FROM

(

SELECT a, b, c, d FROM  testy where e <> 'email' and c='1035049'  ORDER BY  a, b, c, e

)

Kind Regards,

Misa

2013/4/24 Rafał Pietrak <rafal@xxxxxxxxxxxxxxxxxx>

    W dniu 03/24/2013 12:11 PM, Rafał
      Pietrak pisze:

      W dniu 03/24/2013 12:06 PM, Misa
        Simic pisze:

        maybe,

          SELECT

              DISTINCT issuer,amount, array_agg(REFERENCE) over
              (partition by invoice_nr) from invoices;

      RIGHT. Thenx. (and the first thing I did, I've read the doc on
      array_agg().... what stress makes from people :(

    Actually, I have a problem with that (which I haven't noticed
    earlier because the data I'm having, don't have to many "duplicates"
    that cause it). The problem is, that:

    --------------------------------------

    SELECT DISTINCT a, b, c, array_agg(k.d) OVER (PARTITION BY k.c ) 
    FROM testy k where k.e <> 'email' and k.c='1035049' ;

      a   |    b     |    c    |           array_agg           

    ------+----------+---------+-------------------------------

     1035 | 10410053 | 1035049 | {9902031328529,5951948640868}

    ---------------------------------------

    is _almost_ fine. But I actually need to have control over the order
    in which the array gathered its values. So I try:

    ------------------------------------

    SELECT DISTINCT a, b, c, array_agg(k.d) OVER (PARTITION BY k.c ORDER
    BY k.e)  FROM testy k where k.e <> 'email' and k.c='1035049' ;

      a   |    b     |    c    |           array_agg           

    ------+----------+---------+-------------------------------

     1035 | 10410053 | 1035049 | {5951948640868}

     1035 | 10410053 | 1035049 | {5951948640868,9902031328529}

    (2 rows)

    --------------------------------------

    And this is not at all what I've expected - the aggerate function
    returned different values over the selected partition.

    I understand, that this behavior (of changing the aggregate function
    return values) is there for the purpose of having sum() - and the
    like - aggregate functions return accumulating/averaged/etc values
    as of the example in postgres documentation (

    http://www.postgresql.org/docs/9.1/static/tutorial-window.html)

    But the array_agg() is significantly different from other aggregate
    functions - it maintains all the trasspassed values within; under
    such circumstances: is it reasonable to copy that functionality (of
    PARTITION OVER ... ORDER BY...) in it?

    A particular value relevant to a particular row (when SELECT
    withiout DISTINCT) can be retrieved by RANK() function used as an
    index into the resulting array.

    But, if (unfortunately) this functionality have to stay: Can
    somebody pls help me cooking an SQL that returns the same value of
    array_agg() over the entire partition, while letting me control the
    order of aggregated values, based on the order of column "E"?

    My table for the showcase was:

    -----------------------------------

    SELECT * FROM testy;

      a   |    b     |    c    |          d           |   e   

    ------+----------+---------+----------------------+-------

     1035 | 10410053 | 1035049 | 9902031328529        | tel

     1035 | 10410053 | 1035049 | 5291286807@xxxxxxxxx | email

     1035 | 10410053 | 1035049 | 5951948640868        | tel2

    (3 rows)

    ------------------------------------------

    thx

    -R