Ok thanks for this precision Merlin.
Seems like aggregates are way more powerful than I thought.
Obviously I need a lot more reading about custom aggregates before fully understanding it.
Elliot's query is pure SQL so obviously very cool !
Seems like aggregates are way more powerful than I thought.
Obviously I need a lot more reading about custom aggregates before fully understanding it.
Elliot's query is pure SQL so obviously very cool !
It could be improved at the margin, and aggregates/function are certainly faster on big data.
But if you have no specific needs I would say Elliot is easier and more universal.
Cheers & thanks all for this good discussion.
Rémi-C
2013/10/23 Merlin Moncure <mmoncure@xxxxxxxxx>
> 2013/10/22 Merlin Moncure <mmoncure@xxxxxxxxx>
>> > With a standard loop, I loop n times, and each times I only need the
>> > current
>> > row plus the previous row which I put in memory, thus O(n).
>>
>> For posterity, the above is incorrect. Since the aggregate is ordered
>> through the window function, it gets executed exactly once per output
>> row. It behaves exactly like a loop. You know this because there is
>> no array in the aggregate state.
>>
> just out of pure curiosity,It is always the case. Generally speaking, aggregates, especially
> is it always the case or is it due to this particular aggregate?
user defined aggregates, are run once per input row. In this case
the main utility of window functions is to order the aggregate
execution calls and (especially) allow intermediate output per input
row, instead of per aggregate grouping.
For this *specific* problem, I would give Elliot's (extremely clever)
On Tue, Oct 22, 2013 at 6:01 PM, Robert James <srobertjames@xxxxxxxxx> wrote:
> Wow, this is an excellent discussion - and I must admit, a bit beyond
> my abilities. Is there a consensus as to the best approach to adopt?
> Is Elliot's the best?
query the nod on the basis that it does not require any supporting
infrastructure, which is always nice. That being said, once you start
getting the mojo of user defined aggregates + window functions it
starts to become clear that it's a cleaner way of doing many types of
things that are normally handled by loops.
merlin