On Wed, Feb 1, 2012 at 11:19 AM, Alessandro Gagliardi <alessandro@xxxxxxxx> wrote:
> Final update on this thread: since it is only necessary for me to get a
> rough ratio of the distribution (and not the absolute count), I refactored
> the query to include a subquery that samples from the moments table, thus:
>
>   SELECT moment_id, block_id FROM moments
>    WHERE inserted BETWEEN 'yesterday' AND 'today'
>    ORDER BY RANDOM() LIMIT 10000;
>
> I also took advantage of another table called blocks that happens to
> contain the moment_type as well (thus making it so I don't need to
> reference pg_class). The final query looks like:
>
>   SELECT moment_type, emotion, COUNT(feedback_id)
>     FROM (SELECT moment_id, block_id
>             FROM moments
>            WHERE inserted BETWEEN 'yesterday' AND 'today'
>            ORDER BY RANDOM() LIMIT 10000) AS sample_moments
>     JOIN blocks USING (block_id)
>     JOIN emotions USING (moment_id)
>    GROUP BY moment_type, emotion
>    ORDER BY moment_type, emotion
>
> The explain is at http://explain.depesz.com/s/lYh
>
> Interestingly, increasing the limit does not seem to increase the runtime
> in a linear fashion. When I run it with a limit of 60000 I get a runtime
> of 14991 ms, but if I run it with a limit of 70000 I get a runtime of
> 77744 ms. I assume that's because I'm hitting a memory limit and paging
> out. Is that right?

Hard to say. More likely your query plan changes at that point. Run the
queries with "explain analyze" in front of them to find out.
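
For example, against the query as posted (just prefix the statement; the
rest is unchanged):

  EXPLAIN ANALYZE
  SELECT moment_type, emotion, COUNT(feedback_id)
    FROM (SELECT moment_id, block_id
            FROM moments
           WHERE inserted BETWEEN 'yesterday' AND 'today'
           ORDER BY RANDOM() LIMIT 70000) AS sample_moments
    JOIN blocks USING (block_id)
    JOIN emotions USING (moment_id)
   GROUP BY moment_type, emotion
   ORDER BY moment_type, emotion;

Run it once with LIMIT 60000 and once with LIMIT 70000 and compare the two
plans. If the join strategy or sort method differs between them (say, an
in-memory quicksort at 60000 becoming an external merge sort on disk at
70000), that would explain the jump in runtime better than paging would.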