I have a data model like this: a JOB has many DOCUMENTs, a DOCUMENT has many TRANSLATION_UNITs, and a TRANSLATION_UNIT has many TRANSLATIONs (linked via the foreign keys visible in the queries below),
with approximately these table sizes (row counts):
JOB: 8,000
DOCUMENT: 150,000
TRANSLATION_UNIT: 14,500,000
TRANSLATION: 18,300,000
Now the following query takes about 90 seconds to finish:
select translation.id
from "TRANSLATION" translation
inner join "TRANSLATION_UNIT" unit
    on translation.fk_id_translation_unit = unit.id
inner join "DOCUMENT" document
    on unit.fk_id_document = document.id
where document.fk_id_job = 11698
order by translation.id asc
limit 50 offset 0;
Query plan: http://explain.depesz.com/s/xlR
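(For reference, plans like the one linked above can be captured with EXPLAIN ANALYZE and pasted into explain.depesz.com. A sketch, wrapping the same query; the BUFFERS option is available on 9.2:)

-- Collect the actual execution plan with timing and buffer statistics,
-- then paste the output into explain.depesz.com.
explain (analyze, buffers)
select translation.id
from "TRANSLATION" translation
inner join "TRANSLATION_UNIT" unit
    on translation.fk_id_translation_unit = unit.id
inner join "DOCUMENT" document
    on unit.fk_id_document = document.id
where document.fk_id_job = 11698
order by translation.id asc
limit 50 offset 0;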
With the following modification, the time is reduced to 20-30 seconds (query plan):
with CTE as (
    select tr.id
    from "TRANSLATION" tr
    inner join "TRANSLATION_UNIT" unit
        on tr.fk_id_translation_unit = unit.id
    inner join "DOCUMENT" doc
        on unit.fk_id_document = doc.id
    where doc.fk_id_job = 11698
)
select * from CTE
order by id asc
limit 50 offset 0;
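(Side note: on PostgreSQL 9.2 a WITH query is an optimization fence, so the planner materializes the CTE result before applying the ORDER BY/LIMIT. The plain-subquery form below is only a sketch to contrast the two; the planner is free to flatten it and may then choose the same plan as the original query:)

-- Same restructuring written as a plain subquery; unlike the CTE,
-- 9.2 may flatten this and plan it exactly like the original query.
select id
from (
    select tr.id
    from "TRANSLATION" tr
    inner join "TRANSLATION_UNIT" unit
        on tr.fk_id_translation_unit = unit.id
    inner join "DOCUMENT" doc
        on unit.fk_id_document = doc.id
    where doc.fk_id_job = 11698
) sub
order by id asc
limit 50 offset 0;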
There are about 212,000 records satisfying the query's criteria. When I change 11698 to another id so that there are roughly 40,000 matching records instead, the queries take 40 ms and 55 ms, respectively. The query plans also change: the original query, the CTE variant.
Is it normal to see a 2,100× increase in execution time (or roughly 450× for the CTE variant) when the number of matching records grows only about 5×?
I ran ANALYZE on all tables just before executing the queries, and there are indexes on all the columns involved.
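(For completeness, the indexes look roughly like this. The index names below are made up for illustration; the point is that every join and filter column carries a plain btree index, and the id columns are primary keys:)

-- Hypothetical names; the actual definitions in the schema may differ.
create index idx_document_fk_id_job   on "DOCUMENT" (fk_id_job);
create index idx_tu_fk_id_document    on "TRANSLATION_UNIT" (fk_id_document);
create index idx_translation_fk_id_tu on "TRANSLATION" (fk_id_translation_unit);
-- TRANSLATION.id, TRANSLATION_UNIT.id and DOCUMENT.id are primary keys,
-- so they are indexed implicitly.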
System info:
PostgreSQL 9.2
shared_buffers = 2048MB
effective_cache_size = 4096MB
work_mem = 32MB
Total memory: 32 GB
CPU: Intel Xeon X3470 @ 2.93 GHz, 8 MB cache
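(One experiment worth noting, in case sorting the ~212,000 ids spills to disk at work_mem = 32MB: raise work_mem for the session only and re-run the queries above; a sketch:)

set work_mem = '256MB';  -- session-local; the server default stays at 32MB
show work_mem;           -- verify the session value
-- ... re-run the queries above and compare the plans, then:
reset work_mem;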
Thank you.