Re: Erratically behaving query needs optimization

On Sun, Aug 25, 2019 at 5:51 PM Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
>
> Yes, it certainly looks like it is due to cold caches.  But you say it is slow at first, and then say it varies greatly during a run.  Is being slow at first the only way it varies greatly, or is there large variation even beyond that?

There is great variation in run times (from hundreds of milliseconds
to several seconds) even beyond server startup.
The query runs several times, with a different device_id, object_id
and list of attribute_ids each time, and the run times vary from one
execution to another.

> You can use pg_prewarm to overcome the cold cache issue when you first start up the server.

I cannot find anything related to pg_prewarm other than some dead
ends from 2013, from which I gather it only works on Linux.
Anyway, I have problems even beyond the start of the database; it's
just easier to reproduce the problem at startup, otherwise I have to
leave the application running for a while (to invalidate part of the
cache, I think).
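
In case it helps others, this is the minimal usage sketch I pieced
together for the pg_prewarm contrib module (assuming PostgreSQL 9.4
or later; the index name below is hypothetical):

 CREATE EXTENSION IF NOT EXISTS pg_prewarm;
 -- pull the heap of "results" into shared buffers
 SELECT pg_prewarm('results');
 -- an index can be prewarmed the same way (hypothetical name)
 SELECT pg_prewarm('results_object_id_attribute_id_idx');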

> If you query only on "results" with only the conditions that apply to "results", what is the expected number of rows, and what is the actual number of rows?

EXPLAIN output for the query on "results" only: https://explain.depesz.com/s/Csau

EXPLAIN (ANALYZE, BUFFERS)
 SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
 FROM results
 WHERE
     results.data <> '<NullData/>'
     AND results.data IS NOT NULL
     AND results.object_id = 1955
     AND results.attribute_id IN (4, 5) -- possibly a longer list here
     AND results.data_access_result = 'SUCCESS'
 ORDER BY results.attribute_id, results.timestamp DESC
 LIMIT 2 -- limit by the length of the attributes list

Limit  (cost=166793.28..167335.52 rows=2 width=54) (actual time=134783.510..134816.941 rows=2 loops=1)
  Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
  ->  Unique  (cost=166793.28..168420.01 rows=6 width=54) (actual time=134783.507..134816.850 rows=2 loops=1)
        Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
        ->  Sort  (cost=166793.28..167606.64 rows=325346 width=54) (actual time=134783.505..134802.602 rows=205380 loops=1)
              Sort Key: attribute_id, "timestamp" DESC
              Sort Method: external merge  Disk: 26456kB
              Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
              ->  Gather  (cost=1000.00..125882.23 rows=325346 width=54) (actual time=32.325..133815.793 rows=410749 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=19086 read=46836
                    ->  Parallel Seq Scan on results  (cost=0.00..92347.63 rows=135561 width=54) (actual time=18.496..133871.888 rows=136916 loops=3)
                          Filter: ((data IS NOT NULL) AND (data <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                          Rows Removed by Filter: 920123
                          Buffers: shared hit=19086 read=46836
Planning Time: 5.071 ms
Execution Time: 134874.687 ms

As far as I can see, the estimates are close to the actual returned
rows in the "Parallel Seq Scan on results" node.
The numbers are similar (of course) if I turn off query parallelism.
Or should I VACUUM ANALYZE again? I'm fairly sure I ran it recently
enough.
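
To double-check that, this is how I verify when the table was last
vacuumed and analyzed, using the standard pg_stat_user_tables view:

 SELECT relname, last_vacuum, last_autovacuum,
        last_analyze, last_autoanalyze
 FROM pg_stat_user_tables
 WHERE relname = 'results';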

>> How can I improve it to be consistently fast (is it possible to get to
>> several milliseconds?)?
>
>
> Maybe.  Depends on the answer to my previous question.
>
>>
>> What I don't really understand is why the nested loop has 3 loops
>> (three joined tables)?
>
>
> Each parallel execution counts as a loop.  There are 2 parallel workers, plus the leader also participates, making three.
>
>>
>> And why does the first index scan indicate ~60k loops? And does it
>> really work? It doesn't seem to filter out any rows.
>
>
> The parallel hash join returns about 20,000 rows, but I think that is just for one of the three parallel executions, making about 60,000 in total.  I don't know why one of the nodes reports combined execution and the other just a single worker.  Parallel queries are hard to understand.  When I want to optimize a query that does parallel execution, I just turn off parallelism ("set max_parallel_workers_per_gather TO 0;") at first to make it simpler to understand.
>
> Apparently everything with device_id = 97 just happens to pass all the rest of your filters.  If you need those filters to make sure you get the right answer in all cases, then you need them.  A lifeboat isn't useless just because your ship didn't happen to sink today.

The part about the workers makes sense, thanks.
As for the condition, I thought there was something more contrived
going on in the planner, but I failed to see it was that simple: there
is no need to remove anything, since everything matches the condition.
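
For future reference, this is the non-parallel debugging workflow you
described, applied to my query (the setting is session-local, so
nothing needs restarting):

 SET max_parallel_workers_per_gather TO 0;
 EXPLAIN (ANALYZE, BUFFERS)
  SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
  FROM results
  WHERE results.data <> '<NullData/>'
      AND results.data IS NOT NULL
      AND results.object_id = 1955
      AND results.attribute_id IN (4, 5)
      AND results.data_access_result = 'SUCCESS'
  ORDER BY results.attribute_id, results.timestamp DESC
  LIMIT 2;
 RESET max_parallel_workers_per_gather;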

>> Should I add an index only on (attribute_id, object_id)? And maybe
>> data_access_result?
>> Does it make sense to add it on a text column (results.data)?
>
>
> Which parts of query you give are going to change from execution to execution?
>
> Assuming the parts for object_id and attribute_id are variable and the rest are static, I think the optimal index would be "create index on results (object_id, attribute_id) where data IS NOT NULL and data <> '<NullData/>' and data_access_result = 'SUCCESS'"

That's right, object_id and attribute_id are variable; the device_id
is variable as well.
If I create that index, the planner does not choose it when executing
the full "joined" query; it does choose it when I run only the
"results"-related conditions, without joining, which makes sense in
light of what you said below.
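
For completeness, this is the index from your suggestion as I created
it (PostgreSQL generates a name when none is given):

 CREATE INDEX ON results (object_id, attribute_id)
  WHERE data IS NOT NULL
    AND data <> '<NullData/>'
    AND data_access_result = 'SUCCESS';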

> Why does results.data have two different "spellings" for null data?

One where the device couldn't be contacted at all (NULL), and one
where the device was contacted but, for my (object_id, attribute_ids)
combinations, actually returned null data ('<NullData/>').

> However, if the number of rows from "results" that meet all your criteria is high, the index won't make much of a difference.  The planner has a fundamental choice to make: should it seek things with device_id = 97, and then check each of those to see if they satisfy your conditions on the "results" fields?  Or should it seek things that satisfy the "results" field conditions, and then check each of those to see if they satisfy device_id = 97?  It is currently doing the first of those.  Whether it should be doing the second, and whether creating the index will cause it to switch to the second, are two (separate) questions which can't be answered with the data given.

So maybe I should denormalize, moving the device_id column into the
"results" table, and add it to the index you suggested above?

> Cheers,
>
> Jeff

Thank you for the detailed response.

-- 

Barbu Paul - Gheorghe





