On Tue, Aug 20, 2019 at 10:22 AM Barbu Paul - Gheorghe <barbu.paul.gheorghe@xxxxxxxxx> wrote:
> The query takes a long time to execute, especially at first, due to
> cold caches I think, but the performance varies greatly during a run
> of the application (while the user applies the said action several
> times).
Yes, it certainly looks like it is due to cold caches. But you say it is slow at first, and then say it varies greatly during a run. Is being slow at first the only way it varies greatly, or is there large variation even beyond that?
You can use pg_prewarm to overcome the cold cache issue when you first start up the server.
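For example, something along these lines warms the relevant blocks after a restart (a sketch; it assumes the pg_prewarm extension is available on your installation, and the index name below is hypothetical):

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
-- load the table into shared_buffers
SELECT pg_prewarm('results');
-- repeat for whichever index(es) the query actually uses, e.g. (hypothetical name):
SELECT pg_prewarm('results_object_id_attribute_id_idx');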
> EXPLAIN (ANALYZE,BUFFERS)
> SELECT DISTINCT ON (results.attribute_id) results.timestamp,
> results.data FROM results
> JOIN scheduler_operation_executions ON
> scheduler_operation_executions.id = results.operation_execution_id
> JOIN scheduler_task_executions ON scheduler_task_executions.id =
> scheduler_operation_executions.task_execution_id
> WHERE scheduler_task_executions.device_id = 97
> AND results.data <> '<NullData/>'
> AND results.data IS NOT NULL
> AND results.object_id = 1955
> AND results.attribute_id IN (4, 5) -- possibly a longer list here
> AND results.data_access_result = 'SUCCESS'
> ORDER BY results.attribute_id, results.timestamp DESC
> LIMIT 2 -- limit by the length of the attributes list
If you query only on "results" with only the conditions that apply to "results", what is the expected number of rows, and what is the actual number of rows?
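Something like this would show both numbers (a rough sketch, just your query with the joins and the device_id condition stripped out; the scan node's estimated rows vs. actual rows is what I'm after):

EXPLAIN (ANALYZE, BUFFERS)
SELECT results.timestamp, results.data
FROM results
WHERE results.data <> '<NullData/>'
  AND results.data IS NOT NULL
  AND results.object_id = 1955
  AND results.attribute_id IN (4, 5)
  AND results.data_access_result = 'SUCCESS';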
> ...
> How can I improve it to be consistently fast (is it possible to get to
> several milliseconds?)?
Maybe. Depends on the answer to my previous question.
> What I don't really understand is why the nested loop has 3 loops
> (three joined tables)?
Each parallel execution counts as a loop. There are 2 parallel workers, plus the leader also participates, making three.
> And why does the first index scan indicate ~60k loops? And does it
> really work? It doesn't seem to filter out any rows.
The parallel hash join returns about 20,000 rows, but I think that that is just for one of the three parallel executions, making about 60,000 in total. I don't know why one of the nodes reports the combined execution and the other just a single worker. Parallel queries are hard to understand. When I want to optimize a query that does parallel execution, I just turn off parallelism ("set max_parallel_workers_per_gather TO 0;") at first to make it simpler to understand.
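For example, in a psql session (a sketch; adjust to however you run your tests):

SET max_parallel_workers_per_gather TO 0;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;   -- the query from above, now with a serial plan
RESET max_parallel_workers_per_gather;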
Apparently everything with device_id = 97 just happens to pass all the rest of your filters. If you need those filters to make sure you get the right answer in all cases, then you need them. A lifeboat isn't useless just because your ship didn't happen to sink today.
> Should I add an index only on (attribute_id, object_id)? And maybe
> data_access_result?
> Does it make sense to add it on a text column (results.data)?
Which parts of the query you give are going to change from execution to execution?
Assuming the parts for object_id and attribute_id are variable and the rest are static, I think the optimal index would be "create index on results (object_id, attribute_id) where data IS NOT NULL and data <> '<NullData/>' and data_access_result = 'SUCCESS'"
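Written out, that is (same definition as above, just formatted; adjust if your column names differ):

CREATE INDEX ON results (object_id, attribute_id)
  WHERE data IS NOT NULL
    AND data <> '<NullData/>'
    AND data_access_result = 'SUCCESS';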
Why does results.data have two different "spellings" for null data?
However, if the number of rows from "results" that meet all your criteria is high, the index won't make much of a difference. The planner has a fundamental choice to make: should it seek things with device_id = 97, and then check each of those to see if they satisfy your conditions on the "results" fields? Or should it seek things that satisfy the "results" field conditions, and then check each of those to see if they satisfy device_id = 97? It is currently doing the first of those. Whether it should be doing the second, and whether creating the index will cause it to switch to the second, are two separate questions which can't be answered with the data given.
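One quick way to get a feel for that trade-off is to compare the row count from the "results"-only query above against how many task executions the device side produces (a rough sketch, using the literal from your query):

SELECT count(*) FROM scheduler_task_executions WHERE device_id = 97;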
Cheers,
Jeff