On Wed, Feb 1, 2012 at 11:19 AM, Alessandro Gagliardi <alessandro@xxxxxxxx> wrote:
> Final update on this thread: since it is only necessary for me to get a
> rough ratio of the distribution (and not the absolute count), I refactored
> the query to include a subquery that samples from the moments table, thus:
>
>   SELECT moment_id, block_id FROM moments
>    WHERE inserted BETWEEN 'yesterday' AND 'today'
>    ORDER BY RANDOM() LIMIT 10000;
>
> I also took advantage of another table called blocks that happens to
> contain the moment_type as well (thus making it so I don't need to
> reference pg_class). The final query looks like:
>
>   SELECT moment_type, emotion, COUNT(feedback_id)
>     FROM (SELECT moment_id, block_id
>             FROM moments
>            WHERE inserted BETWEEN 'yesterday' AND 'today'
>            ORDER BY RANDOM() LIMIT 10000) AS sample_moments
>     JOIN blocks USING (block_id)
>     JOIN emotions USING (moment_id)
>    GROUP BY moment_type, emotion
>    ORDER BY moment_type, emotion
>
> The explain is at http://explain.depesz.com/s/lYh
>
> Interestingly, increasing the limit does not seem to increase the runtime
> in a linear fashion. When I run it with a limit of 60000 I get a runtime
> of 14991 ms, but if I run it with a limit of 70000 I get a runtime of
> 77744 ms. I assume that's because I'm hitting a memory limit and paging
> out. Is that right?

Hard to say. More likely your query plan changes at that point. Run the
queries with "explain analyze" in front of them to find out.
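
For example, against the query as posted (just prefix the statement; the
rest is unchanged):

  EXPLAIN ANALYZE
  SELECT moment_type, emotion, COUNT(feedback_id)
    FROM (SELECT moment_id, block_id
            FROM moments
           WHERE inserted BETWEEN 'yesterday' AND 'today'
           ORDER BY RANDOM() LIMIT 70000) AS sample_moments
    JOIN blocks USING (block_id)
    JOIN emotions USING (moment_id)
   GROUP BY moment_type, emotion
   ORDER BY moment_type, emotion;

Run it once with LIMIT 60000 and once with LIMIT 70000 and compare the two
plans. If the join strategy or sort method differs between them (say, an
in-memory quicksort at 60000 becoming an external merge sort on disk at
70000), that would explain the jump in runtime better than paging would.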