Tomas Vondra-4 wrote
> Please share explain plans for both the slow and the fast query. That
> makes it easier to spot the difference, and possibly identify the cause.
>
> Also, what PostgreSQL version is this, and what are "basic" config
> parameters (shared buffers, work mem)?
I am running 9.4.4, here are the basic config parameters:
work_mem = 32 MB
shared_buffers = 8196 MB
temp_buffers = 8 MB
effective_cache_size = 4 GB
I have run ANALYZE on all tables prior to running the queries. The query
plan for the fast version (without the WHERE clause) follows:
<http://postgresql.nabble.com/file/n5864075/qp2.png>
What I don't understand is the difference between the inner NESTED LOOP
between the slow and the fast query plan. In the fast one, both index scans
have 1000 as the actual row count. I would expect that, given the LIMIT
clause. The slow query plan, however, shows ~ 75 000 000 as the actual row
count. Is the extra WHERE condition the only and *plausible* explanation for
this difference?
In the slow query it requires evaluating every single document to determine which of the 75 million translations can be discarded; after which the first 1000 when sorted by translation id are returned.
In the first query the executor simply scans the translation index in ascending order and stops after retrieving the first 1,000.
What you are expecting, I think, is for that same process to continue beyond 1,000 should any of the first 1,000 be discarded due to the corresponding document not being updated recently enough, until 1,000 translations are identified. I'm not sure why the nested loop executor is not intelligent enough to do this...
The important number in these plans is "loops", not "rows"
David J.