Search Postgresql Archives

Re: Proposal to introduce a shuffle function to intarray extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 16.07.22 um 18:53 schrieb Mladen Gogala:
On 7/15/22 04:36, Martin Kalcher wrote:
Dear list,

i am dealing with an application that processes fairly large arrays of
integers. It makes heavy use of the intarray extension, which works
great in most cases. However, there are two requirements that cannot
be addressed by the extension and are rather slow with plain SQL. Both
can be met with shuffling:

- Taking n random members from an integer array
- Splitting an array into n chunks, where each member is assigned to a
random chunk

Shuffling is currently implemented by unnesting the array, ordering
the members by random() and aggregating them again.


Martin, have you considered PL/Python and NumPy module?

Hey Mladen,

thank you for your advice. Unfortunately the performance of shuffling with NumPy is about the same as with SQL.

  create function numpy_shuffle(arr int[])
  returns int[]
  as $$
    import numpy
    numpy.random.shuffle(arr)
    return arr
  $$ language 'plpython3u';

  select arr[1:3]::text || ' ... ' || arr[3999998:4000000]::text
  from (
    select numpy_shuffle(arr) arr from numbers
  ) shuffled;

  -------------------------------------------------------
   {674026,3306457,1727170} ... {343875,3825484,1235246}

  Time: 2315.431 ms (00:02.315)

Am i doing something wrong?

Martin





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux