Am 25.01.19 um 06:20 schrieb ldh@xxxxxxxxxxxxxxxxxx:
Hello,
We have been stuck for the past week on a query that simply won’t
“execute”. We have a table with 1.2B rows that took around 14h to
load, but a simple select takes forever and after 10h, no records are
coming through still.
Environment:
- Table tmp_outpatient_rev with 41 VARCHAR columns
(desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd,
rev_cntr, rev_cntr_dt, …)
- 1.2B rows (Billion with a ‘B’)
- A single Unique Index on columns desy_sort_key, claim_no,
clm_line_num
- select pg_size_pretty(pg_relation_size('tmp_outpatient_rev'))
--> 215GB
- Database Server: 64GB, 8 cores/16 threads, HDDs 10K
- Linux
- PG 11.1
Query:
select * from tmp_outpatient_rev order by desy_sort_key, claim_no
Plan:
Gather Merge (cost=61001461.16..216401602.29 rows=1242732290
width=250)
Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt,
nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
Workers Planned: 10
-> Sort (cost=61000460.97..61311144.04 rows=124273229 width=250)
Output: desy_sort_key, claim_no, clm_line_num,
clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
Sort Key: tmp_outpatient_rev.desy_sort_key,
tmp_outpatient_rev.claim_no
-> Parallel Seq Scan on public.tmp_outpatient_rev
(cost=0.00..29425910.29 rows=124273229 width=250)
Output: desy_sort_key, claim_no, clm_line_num,
clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
Method of access:
- Using Pentaho Kettle (an ETL tool written in Java and using
JDBC), we simply issue the query and expect records to start streaming
in ASAP.
- Issue was replicated with really basic JDBC code in a Java test
program.
- The database doesn't have much other data and the table was
loaded from a CSV data source with LOAD over something like 14h
(average throughput of about 25K rows/s)
- Settings:
alter database "CMS_TMP" set seq_page_cost=1;
alter database "CMS_TMP" set random_page_cost=4;
alter database "CMS_TMP" set enable_seqscan=true;
JDBC connection string with no extra params.
Database has been generally configured properly.
Problem:
- The plan shows a full table scan followed by a sort, and then a
gather merge. With 1.2B rows, that's crazy to try to sort that 😊
- After 10h, the query is still "silent" and no record is
streaming in. IO is very high (80-90% disk throughput utilization) on
the machine (the sort…).
- I have tried to hack the planner to force an index scan (which
would avoid the sort/gather steps and should start streaming data
right away), in particular, enable_seqscan=false or seq_page_cost=2.
This had ZERO impact on the plan to my surprise.
- I changed the “order by” to include all 3 columns from the index,
or created a non-unique index with only the first 2 columns, all to no
effect whatsoever either.
- The table was written over almost 14h at about 25K row/s and it
seems to me I should be able to read the data back at least as fast.
Why is a simple index scan not used? Why are all our efforts to try to
force the use of the index failing?
the query isn't that simple, there is no where condition, so PG has to
read the whole table and the index is useless. Would it be enought to
select only the columns covered by the index?
(run a vacuum on the table after loading the data, that's can enable a
index-only-scan in this case)
Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com