Re: Specific query taking time to process

Duncan Whitham <dwhitham@xxxxxxxxx> · Fri, 31 Jan 2020 11:01:20 +0000

Thanks for the reply Tom - you've been a great help. 
I had been looking at changing - default_statistics_target in a broadbrush fashion - changing it to 1000 on alf_node, alf_node_aspects and alf_node_properties and that gets the query to run fine irrespective of work_mem settings. 

I then set them all back to default and reanalyzed.

I then applied the change just to the qname_id column on alf_node_aspects -

alter table alf_node_aspects alter qname_id SET
STATISTICS 1000;

reanalyzed alf_node_aspects and the estimate is much better ....

Nested Loop Semi Join  (cost=618944.02..3000749.68 rows=8 width=8) (actual time=5391.271..31441.085 rows=212 loops=1)
   Buffers: shared hit=756 read=2189959 written=1
   ->  Hash Semi Join  (cost=618943.45..3000659.89 rows=37 width=16) (actual time=5391.212..31437.065 rows=212 loops=1)
         Hash Cond: (node.id = prop.node_id)
         Buffers: shared hit=216 read=2189573 written=1
         ->  Bitmap Heap Scan on alf_node node  (cost=618862.32..2989274.54 rows=4290813 width=8) (actual time=4806.877..28240.746 rows=41976707 loops=1)
               Recheck Cond: ((store_id = 6) AND (type_qname_id = 240))
               Rows Removed by Index Recheck: 57797868
               Filter: (NOT (hashed SubPlan 1))
               Rows Removed by Filter: 2453
               Heap Blocks: exact=63327 lossy=1908882
               Buffers: shared hit=191 read=2189499 written=1
               ->  Bitmap Index Scan on idx_alf_node_mdq  (cost=0.00..600678.68 rows=19600211 width=0) (actual time=4773.927..4773.927 rows=41990879 loops=1)
                     Index Cond: ((store_id = 6) AND (type_qname_id = 240))
                     Buffers: shared read=214101 written=1
               SubPlan 1
                 ->  Index Scan using fk_alf_nasp_qn on alf_node_aspects aspect_1  (cost=0.57..17099.41 rows=4611 width=8) (actual time=0.036..13.453 rows=4909 loops=1)
                       Index Cond: (qname_id = 251)
                       Buffers: shared hit=191 read=3189
         ->  Hash  (cost=70.20..70.20 rows=875 width=8) (actual time=0.363..0.363 rows=228 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 17kB
               Buffers: shared hit=25 read=74
               ->  Index Only Scan using idx_alf_nprop_s on alf_node_properties prop  (cost=0.70..70.20 rows=875 width=8) (actual time=0.047..0.337 rows=228 loops=1)
                     Index Cond: ((qname_id = '242'::bigint) AND (string_value = 'S434071'::text))
                     Heap Fetches: 2
                     Buffers: shared hit=25 read=74
   ->  Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect  (cost=0.57..3.99 rows=2 width=8) (actual time=0.016..0.016 rows=1 loops=212)
         Index Cond: ((node_id = node.id) AND (qname_id = 245))
         Heap Fetches: 15
         Buffers: shared hit=540 read=386
 Planning time: 0.903 ms
 Execution time: 31441.206 ms
(32 rows)

But there is still a heap scan on alf_node instead of using the index - what should my next move be ? Change the stats collection level on store_id and  type_qname_id on alf_node - or reindex alf_node ? 

On Thu, Jan 30, 2020 at 4:36 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Duncan Whitham <dwhitham@xxxxxxxxx> writes:

> We now only need 1 environment as we  can replicate the performance problem

> on a copy of live – snapshot/restore from AWS of live. We now have a vacuum

> analyse running every night on the 3 tables in question on live – to

> eliminate bloat and inaccurate stats as the root of the problem.

Hmm, doesn't seem like that's getting the job done.  I can see at

least one serious misestimate in these plans:

>                        ->  Bitmap Heap Scan on alf_node_aspects aspect_1

> (cost=3420.59..418372.63 rows=163099 width=8) (actual time=1.402..5.243

> rows=4909 loops=1)

>                              Recheck Cond: (qname_id = 251)

It doesn't seem to me that such a simple condition ought to be

misestimated by a factor of 30, so either you need to crank up

the stats target for this column or you need to analyze the

table more often.

The other rowcount estimates don't seem so awful, but this one is

contributing to the planner thinking that "SubPlan 1" is going to

be very expensive, which probably accounts for it trying to avoid

what's actually a cheap plan.

                        regards, tom lane

-- 
   Duncan Whitham Developer, Zaizi 
| m: (+44)751 502 7049 | t: (+44)20 3582 8330  
| e: dwhitham@xxxxxxxxx | w: http://www.zaizi.com/

This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately. Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory. 

Zaizi Ltd is registered in England and Wales with the registration number 6440931. The Registered Office is Kings House, 174 Hammersmith Road, London W6 7JP.