Zero throughput on a query on a very large table.

"ldh@xxxxxxxxxxxxxxxxxx" <ldh@xxxxxxxxxxxxxxxxxx> · Fri, 25 Jan 2019 05:20:00 +0000

Hello,

We have been stuck for the past week on a query that simply won’t “execute”. We have a table with 1.2B rows that took around 14h to load, but a simple select takes forever and after 10h, no records are coming through still.

Environment:
     - Table tmp_outpatient_rev with 41 VARCHAR columns (desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …)
     - 1.2B rows (Billion with a ‘B’)
     - A single Unique Index on columns desy_sort_key, claim_no, clm_line_num
     - select pg_size_pretty(pg_relation_size('tmp_outpatient_rev')) --> 215GB
     - Database Server: 64GB, 8 cores/16 threads, HDDs 10K
     - Linux
     - PG 11.1

Query:
     select * from tmp_outpatient_rev order by desy_sort_key, claim_no

Plan:
    Gather Merge  (cost=61001461.16..216401602.29 rows=1242732290 width=250)
      Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
      Workers Planned: 10
      ->  Sort  (cost=61000460.97..61311144.04 rows=124273229 width=250)
            Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …
            Sort Key: tmp_outpatient_rev.desy_sort_key, tmp_outpatient_rev.claim_no
            ->  Parallel Seq Scan on public.tmp_outpatient_rev  (cost=0.00..29425910.29 rows=124273229 width=250)
                  Output: desy_sort_key, claim_no, clm_line_num, clm_thru_dt, nch_clm_type_cd, rev_cntr, rev_cntr_dt, …

Method of access:
    - Using Pentaho Kettle (an ETL tool written in Java and using JDBC), we simply issue the query and expect records to start streaming in ASAP.
   - Issue was replicated with really basic JDBC code in a Java test program.
    - The database doesn't have much other data and the table was loaded from a CSV data source with LOAD over something like 14h (average throughput of about 25K rows/s)
    - Settings:
              alter database "CMS_TMP" set seq_page_cost=1;
              alter database "CMS_TMP" set random_page_cost=4;
              alter database "CMS_TMP" set enable_seqscan=true;
              JDBC connection string with no extra params.
              Database has been generally configured properly.

Problem:
    - The plan shows a full table scan followed by a sort, and then a gather merge. With 1.2B rows, that's crazy to try to sort that
😊
    - After 10h, the query is still "silent" and no record is streaming in. IO is very high (80-90% disk throughput utilization) on the machine (the sort…).
    - I have tried to hack the planner to force an index scan (which would avoid the sort/gather steps and should start streaming data right away), in particular, enable_seqscan=false or seq_page_cost=2. This had ZERO impact on the plan
 to my surprise.
   - I changed the “order by” to include all 3 columns from the index, or created a non-unique index with only the first 2 columns, all to no effect whatsoever either.
    - The table was written over almost 14h at about 25K row/s and it seems to me I should be able to read the data back at least as fast.

Why is a simple index scan not used? Why are all our efforts to try to force the use of the index failing?

Any help is very much appreciated as we are really hitting a wall here with that table.

Thank you so much.

Laurent Hasson