2009/10/14 Scott Marlowe <scott.marlowe@xxxxxxxxx>
On Wed, Oct 14, 2009 at 1:20 AM, Pavel Stehule <pavel.stehule@xxxxxxxxx> wrote:Exactly. If you're running that query over and over your "performance
> 2009/10/14 Thom Brown <thombrown@xxxxxxxxx>:
>> 2009/10/14 Scott Marlowe <scott.marlowe@xxxxxxxxx>:
>> Why not just do something like:
>>
>> SELECT thisfield, thatfield
>> FROM my_table
>> WHERE thisfield IS NOT NULL
>> ORDER BY RANDOM()
>> LIMIT 1;
>>
>
> this works well on small tables. On large tables this query is extremely slow.
test" is on how well pgsql can run that very query. :) Anything else
you do is likely to be noise by comparison.
What I am using often to get a set of random rows is
SELECT thisfield, thatfield
FROM my_table
WHERE random() < rowsneeded::float8/(select count * from my_table);
Of course it does not give exact number of rows, but close enough for me.
As of taking one row I'd try:
select * from (
SELECT thisfield, thatfield
FROM my_table
WHERE random() < 100.0/(select count * from my_table))
a order by random() limit 1
I'd say probability of returning no rows is quite low and query can be extended even more by returning first row from table in this rare case.