Hi Team,
Greetings for the day!!
Platform:
PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
Issue:
We have the base table which contains 22M records and we created a view on top of it while querying the view with ILIKE clause it took 44 seconds and with LIKE Clause 20 Seconds
Query:
fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';
select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';
Also attaching the comparison for both ILIKE and LIKE test performed.
Expectation:
How can we optimize our ILIKE query, since it is hardcoded in the application and we can't use any other keyword than ILIKE .
BR//
Aman Gupta
+918447611183
fm_db_custom_db=# \d+ test_20m_view View "mmsuper.test_20m_view" Column | Type | Modifiers | Storage | Description -------------------+-----------------------------+-----------+----------+------------- inputfilename | character varying(300) | | extended | source | character varying(80) | | extended | hostname | character varying(300) | | extended | configname | character varying(300) | | extended | logicalservername | character varying(300) | | extended | clustername | character varying(80) | | extended | inputtime | timestamp without time zone | | plain | totalinputbytes | bigint | | plain | totalinputcdrs | bigint | | plain | destination | character varying(80) | | extended | outputfilename | character varying(300) | | extended | processinglink | character varying(80) | | extended | totaloutputbytes | bigint | | plain | totaloutputcdrs | bigint | | plain | outputtime | timestamp without time zone | | plain | View definition: SELECT a67.sourceid AS inputfilename, a67.innodename AS source, a67.hostname, a67.configname, a67.logicalservername, ci.layered_clustername AS clustername, a67.intime AS inputtime, a73.bytes AS totalinputbytes, a73.cdrs AS totalinputcdrs, a68.outnodename AS destination, a68.destinationid AS outputfilename, a68.outnodeid AS processinglink, a68.bytes AS totaloutputbytes, a68.cdrs AS totaloutputcdrs, a68.outtime AS outputtime FROM test_report_auditlog2 a67 LEFT JOIN test_report_auditlog2 a73 ON a67.destinationid::text = a73.sourceid::text AND a67.event = 67 AND a73.event = 73 LEFT JOIN test_report_auditlog2 a80 ON a73.sourceid::text = a80.sourceid::text AND a80.event = 80 LEFT JOIN test_report_auditlog2 a68 ON a80.destinationid::text = a68.sourceid::text AND a68.event = 68 LEFT JOIN clusterinfo ci ON a68.hostname::text = ci.hostname::text; fm_db_custom_db=# fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF) fm_db_custom_db-# select destination, fm_db_custom_db-# hostname, fm_db_custom_db-# inputfilename, fm_db_custom_db-# inputtime, fm_db_custom_db-# logicalservername, fm_db_custom_db-# outputfilename, fm_db_custom_db-# outputtime, fm_db_custom_db-# processinglink, fm_db_custom_db-# source, fm_db_custom_db-# totalinputbytes, fm_db_custom_db-# totalinputcdrs, fm_db_custom_db-# totaloutputbytes, fm_db_custom_db-# totaloutputcdrs fm_db_custom_db-# from mmsuper.test_20m_view fm_db_custom_db-# where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=3141648.59..4290036.08 rows=23099 width=187) (actual rows=5 loops=1) Hash Cond: ((a68.hostname)::text = (ci.hostname)::text) -> Hash Left Join (cost=3141635.44..4289891.49 rows=23099 width=202) (actual rows=5 loops=1) Hash Cond: ((a80.destinationid)::text = (a68.sourceid)::text) -> Hash Right Join (cost=1956092.18..2930038.77 rows=12219 width=141) (actual rows=5 loops=1) Hash Cond: ((a80.sourceid)::text = (a73.sourceid)::text) -> Seq Scan on test_report_auditlog2 a80 (cost=0.00..947821.56 rows=6934090 width=85) (actual rows=6839071 loops=1) Filter: (event = 80) Rows Removed by Filter: 15737822 -> Hash (cost=1956063.82..1956063.82 rows=2269 width=146) (actual rows=3 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 33kB -> Hash Left Join (cost=1003882.60..1956063.82 rows=2269 width=146) (actual rows=3 loops=1) Hash Cond: ((a67.destinationid)::text = (a73.sourceid)::text) Join Filter: (a67.event = 67) -> Seq Scan on test_report_auditlog2 a67 (cost=0.00..947821.56 rows=2269 width=127) (actual rows=3 loops=1) Filter: ((sourceid)::text ~~* '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%'::text) Rows Removed by Filter: 22576890 -> Hash (cost=947821.56..947821.56 rows=4484883 width=61) (actual rows=4449566 loops=1) Buckets: 8388608 Batches: 1 Memory Usage: 453958kB -> Seq Scan on test_report_auditlog2 a73 (cost=0.00..947821.56 rows=4484883 width=61) (actual rows=4449566 loops=1) Filter: (event = 73) Rows Removed by Filter: 18127327 -> Hash (cost=947821.56..947821.56 rows=6995016 width=146) (actual rows=6838663 loops=1) Buckets: 8388608 Batches: 2 Memory Usage: 625310kB -> Seq Scan on test_report_auditlog2 a68 (cost=0.00..947821.56 rows=6995016 width=146) (actual rows=6838663 loops=1) Filter: (event = 68) Rows Removed by Filter: 15738230 -> Hash (cost=11.40..11.40 rows=140 width=178) (actual rows=20 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on clusterinfo ci (cost=0.00..11.40 rows=140 width=178) (actual rows=20 loops=1) Planning time: 4.313 ms Execution time: 44713.314 ms (32 rows) fm_db_custom_db=# fm_db_custom_db=# fm_db_custom_db=# fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF) fm_db_custom_db-# select destination, fm_db_custom_db-# hostname, fm_db_custom_db-# inputfilename, fm_db_custom_db-# inputtime, fm_db_custom_db-# logicalservername, fm_db_custom_db-# outputfilename, fm_db_custom_db-# outputtime, fm_db_custom_db-# processinglink, fm_db_custom_db-# source, fm_db_custom_db-# totalinputbytes, fm_db_custom_db-# totalinputcdrs, fm_db_custom_db-# totaloutputbytes, fm_db_custom_db-# totaloutputcdrs fm_db_custom_db-# from mmsuper.test_20m_view fm_db_custom_db-# where inputfilename like '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=3141648.59..4290036.08 rows=23099 width=187) (actual rows=5 loops=1) Hash Cond: ((a68.hostname)::text = (ci.hostname)::text) -> Hash Left Join (cost=3141635.44..4289891.49 rows=23099 width=202) (actual rows=5 loops=1) Hash Cond: ((a80.destinationid)::text = (a68.sourceid)::text) -> Hash Right Join (cost=1956092.18..2930038.77 rows=12219 width=141) (actual rows=5 loops=1) Hash Cond: ((a80.sourceid)::text = (a73.sourceid)::text) -> Seq Scan on test_report_auditlog2 a80 (cost=0.00..947821.56 rows=6934090 width=85) (actual rows=6839071 loops=1) Filter: (event = 80) Rows Removed by Filter: 15737822 -> Hash (cost=1956063.82..1956063.82 rows=2269 width=146) (actual rows=3 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 33kB -> Hash Left Join (cost=1003882.60..1956063.82 rows=2269 width=146) (actual rows=3 loops=1) Hash Cond: ((a67.destinationid)::text = (a73.sourceid)::text) Join Filter: (a67.event = 67) -> Seq Scan on test_report_auditlog2 a67 (cost=0.00..947821.56 rows=2269 width=127) (actual rows=3 loops=1) Filter: ((sourceid)::text ~~ '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%'::text) Rows Removed by Filter: 22576890 -> Hash (cost=947821.56..947821.56 rows=4484883 width=61) (actual rows=4449566 loops=1) Buckets: 8388608 Batches: 1 Memory Usage: 453958kB -> Seq Scan on test_report_auditlog2 a73 (cost=0.00..947821.56 rows=4484883 width=61) (actual rows=4449566 loops=1) Filter: (event = 73) Rows Removed by Filter: 18127327 -> Hash (cost=947821.56..947821.56 rows=6995016 width=146) (actual rows=6838663 loops=1) Buckets: 8388608 Batches: 2 Memory Usage: 625310kB -> Seq Scan on test_report_auditlog2 a68 (cost=0.00..947821.56 rows=6995016 width=146) (actual rows=6838663 loops=1) Filter: (event = 68) Rows Removed by Filter: 15738230 -> Hash (cost=11.40..11.40 rows=140 width=178) (actual rows=20 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on clusterinfo ci (cost=0.00..11.40 rows=140 width=178) (actual rows=20 loops=1) Planning time: 1.286 ms Execution time: 20208.514 ms (32 rows) fm_db_custom_db=# fm_db_custom_db=# \d test_report_auditlog2 Table "mmsuper.test_report_auditlog2" Column | Type | Modifiers --------------------------+-----------------------------+----------- event | smallint | innodeid | character varying(80) | innodename | character varying(80) | sourceid | character varying(300) | hostname | character varying(300) | configname | character varying(300) | logicalservername | character varying(300) | intime | timestamp without time zone | outnodeid | character varying(80) | outnodename | character varying(80) | destinationid | character varying(300) | outtime | timestamp without time zone | bytes | bigint | cdrs | bigint | recordsequencenumberlist | character varying(1000) | logdetailcdrs | character varying(10) | partial_cdrs | integer | duplicate_cdrs | integer | discarded_cdrs | integer | created_cdrs | integer | corrupted_cdrs | integer | created_files | integer | duplicate_files | integer | corrupted_files | integer | partial_files | integer | discarded_files | integer | empty_files | integer | fm_db_custom_db=#