Re: Very slow query (3-4mn) on a table with 25millions rows

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tom,
Thanks for the hints..

I made various tests for index
The best I could get is the following one with 
create index vat_funcvaratt_multi_idx on functionalvarattributes(split_part(split_part(attvalue,' ',1),'.',1), tag_id, atttype_id);
analyze functionalvarattributes;

explain analyze select s.attvalue from functionalvarattributes s, tags t, variableattributetypes vat where t.id=s.tag_id and t.status!='Internal'and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and vat.id=s.atttype_id and split_part(split_part(s.attvalue,' ',1),'.',1) in (select e.name from functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15) except select s.attvalue from functionalvarattributes s, tags t, usertemplvarattribute utva, usertemplatevariable utv, variableattributetypes vat where vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and vat.id=s.atttype_id and utv.id=utva.usertempvariable_fk and utv.usertempl_id=15 and t.id=s.tag_id and t.status!='Internal'and split_part(split_part(s.attvalue,' ',1),'.',1) in (select e.name from functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15);
                                                                                                 QUERY PLAN                            
                                                                     
---------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------
 HashSetOp Except  (cost=171505.51..2361978.74 rows=1116 width=8) (actual time=66476.682..66476.682 rows=0 loops=1)
   ->  Append  (cost=171505.51..2251949.02 rows=44011889 width=8) (actual time=12511.639..66476.544 rows=320 loops=1)
         ->  Subquery Scan on "*SELECT* 1"  (cost=171505.51..907368.77 rows=310121 width=8) (actual time=12511.638..31775.404 rows=2 lo
ops=1)
               ->  Hash Join  (cost=171505.51..904267.56 rows=310121 width=8) (actual time=12511.636..31775.401 rows=2 loops=1)
                     Hash Cond: (split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e.name)::text)
                     ->  Hash Join  (cost=193.91..726325.20 rows=310121 width=8) (actual time=1.227..24083.777 rows=308287 loops=1)
                           Hash Cond: (s.tag_id = t.id)
                           ->  Hash Join  (cost=188.03..716951.08 rows=1671210 width=16) (actual time=1.157..23810.490 rows=651155 loop
s=1)
                                 Hash Cond: (s.atttype_id = vat.id)
                                 ->  Seq Scan on functionalvarattributes s  (cost=0.00..604688.60 rows=25429960 width=24) (actual time=
0.002..15719.449 rows=25429808 loops=1)
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=1.116..1.116 rows=388 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8) (actual time=0.
005..0.987 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.064..0.064 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.008..0.055 rows=36 loops=1)
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                     ->  Hash  (cost=171250.07..171250.07 rows=4923 width=24) (actual time=7377.344..7377.344 rows=16 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual time=7377.310..7377.329 rows=16 lo
ops=1)
                                 ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=3.178..7377.271 rows=48 loops=1)
                                       Hash Cond: (e.usertemplatevar_id = ut.id)
                                       ->  Seq Scan on functionalvariables e  (cost=0.00..155513.07 rows=4164607 width=32) (actual time
=1.271..5246.277 rows=4164350 loops=1)
                                       ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.026..0.026 rows=16 loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut  (cost=0.29..8.75 rows=16
 width=8) (actual time=0.011..0.020 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
         ->  Subquery Scan on "*SELECT* 2"  (cost=172514.13..1344580.25 rows=43701768 width=8) (actual time=11551.477..34701.030 rows=3
18 loops=1)
               ->  Hash Join  (cost=172514.13..907562.57 rows=43701768 width=8) (actual time=11551.475..34700.876 rows=318 loops=1)
                     Hash Cond: (split_part(split_part((s_1.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e_1.name)::text)
                     ->  Hash Join  (cost=193.91..726325.20 rows=310121 width=8) (actual time=1.281..27733.991 rows=308287 loops=1)
                           Hash Cond: (s_1.tag_id = t_1.id)
                           ->  Hash Join  (cost=188.03..716951.08 rows=1671210 width=16) (actual time=1.194..27391.475 rows=651155 loop
s=1)
                                 Hash Cond: (s_1.atttype_id = vat_1.id)
                                 ->  Seq Scan on functionalvarattributes s_1  (cost=0.00..604688.60 rows=25429960 width=24) (actual tim
e=0.001..17189.172 rows=25429808 loops=1)
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=1.153..1.153 rows=388 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       ->  Seq Scan on variableattributetypes vat_1  (cost=0.00..183.18 rows=388 width=8) (actual time=
0.007..1.015 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.065..0.065 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 ->  Seq Scan on tags t_1  (cost=0.00..5.43 rows=36 width=8) (actual time=0.010..0.053 rows=36 loops=1)
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                     ->  Hash  (cost=172318.46..172318.46 rows=141 width=24) (actual time=6553.620..6553.620 rows=2544 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 134kB
                           ->  Nested Loop  (cost=171201.54..172318.46 rows=141 width=24) (actual time=6550.096..6552.789 rows=2544 loo
ps=1)
                                 ->  Nested Loop  (cost=171201.12..172243.46 rows=16 width=32) (actual time=6550.077..6550.305 rows=256
 loops=1)
                                       ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual time=6542.508..6542.53
5 rows=16 loops=1)
                                             ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=12.705..6542.472 row
s=48 loops=1)
                                                   Hash Cond: (e_1.usertemplatevar_id = ut_1.id)
                                                   ->  Seq Scan on functionalvariables e_1  (cost=0.00..155513.07 rows=4164607 width=32
) (actual time=7.324..5008.051 rows=4164350 loops=1)
                                                   ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.033..0.033 rows=16 loops=
1)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                         ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut_1  (cost=0.29
..8.75 rows=16 width=8) (actual time=0.018..0.026 rows=16 loops=1)
                                                               Index Cond: (usertempl_id = 15)
                                       ->  Materialize  (cost=0.29..8.83 rows=16 width=8) (actual time=0.473..0.478 rows=16 loops=16)
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv  (cost=0.29..8.75 rows=1
6 width=8) (actual time=0.032..0.041 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
                                 ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute utva  (cost=0.4
2..4.60 rows=9 width=8) (actual time=0.002..0.004 rows=10 loops=256)
                                       Index Cond: (usertempvariable_fk = utv.id)
                                       Heap Fetches: 0
 Total runtime: 66476.942 ms
(67 rows)

Is this acceptable or can I get better results?
Thanks
Lana

>>-----Original Message-----
>>From: Tom Lane [mailto:tgl@xxxxxxxxxxxxx]
>>Sent: 25 July 2016 20:07
>>To: Abadie Lana
>>Cc: pgsql-performance@xxxxxxxxxxxxxx
>>Subject: Re:  Very slow query (3-4mn) on a table with 25millions
>>rows
>>
>>Abadie Lana <Lana.Abadie@xxxxxxxx> writes:
>>> I'm having a problem with a slow query - I tried several things to optimize the
>>queries but didn't really help. The output of explain analyse shows sequential
>>scan on a table of 25 million rows. Even though it is indexed and (I put a multi-
>>column index on the fields used in the query), the explain utility shows no usage
>>of the scan...
>>
>>That index looks pretty useless judging from the rowcounts, so I'm not surprised
>>that the planner didn't use it.  You might have better luck with an index on the
>>split_part expression
>>
>>split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1)
>>
>>since it's the join of that to e.name that seems to be actually selective.
>>(The planner doesn't appear to realize that it is, but ANALYZE'ing after creating
>>the index should fix that.)
>>
>>			regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux