Hello, We have 2 TPC-H queries which fetch the same tuples but have significant query execution time differences (22.0 times). We are sharing a pair of TPC-H queries that exhibit this performance difference: First query: SELECT "orders3"."o_comment", "orders3"."o_orderstatus", "orders3"."o_orderkey", "t17"."ps_partkey", "t17"."ps_supplycost", "t17"."ps_comment", "orders3"."o_clerk", "orders3"."o_totalprice", "t17"."ps_availqty", "t17"."ps_suppkey" FROM ( SELECT * FROM "partsupp" WHERE "ps_comment" LIKE ', even theodolites. regular, final theodolites eat after the carefully pending foxes. furiously regular deposits sleep slyly. carefully bold realms above the ironic dependencies haggle careful') AS "t17" LEFT JOIN "orders" AS "orders3" ON true ORDER BY "t17"."ps_supplycost"FETCH next 14 rows only Second query: SELECT "orders3"."o_comment", "orders3"."o_orderstatus", "orders3"."o_orderkey", "t17"."ps_partkey", "t17"."ps_supplycost", "t17"."ps_comment", "orders3"."o_clerk", "orders3"."o_totalprice", "t17"."ps_availqty", "t17"."ps_suppkey" FROM ( SELECT * FROM "partsupp" WHERE "ps_comment" LIKE ', even theodolites. regular, final theodolites eat after the carefully pending foxes. furiously regular deposits sleep slyly. carefully bold realms above the ironic dependencies haggle careful' ORDER BY "ps_supplycost"FETCH next 14 rows only) AS "t17" LEFT JOIN "orders" AS "orders3" ON true ORDER BY "t17"."ps_supplycost"FETCH next 14 rows only * Actual Behavior We executed both queries on the TPC-H benchmark of scale factor 5: the first query takes
over 8 seconds, while the second query only takes 0.3 seconds. * Query Execution Plan
QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=464628.69..464630.32 rows=14 width=223) (actual time=8082.764..8082.767 rows=14 loops=1) -> Gather Merge (cost=464628.69..1193917.91 rows=6250614 width=223) (actual time=8082.762..8087.639 rows=14 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=463628.66..471441.93 rows=3125307 width=223) (actual time=2933.506..2933.506 rows=5 loops=3) Sort Key: partsupp.ps_supplycost Sort Method: quicksort Memory: 25kB Worker 0: Sort Method: top-N heapsort Memory: 32kB Worker 1: Sort Method: quicksort Memory: 25kB -> Nested Loop Left Join (cost=0.00..388506.36 rows=3125307 width=223) (actual time=360.602..1643.471 rows=2500000 loops=3) -> Parallel Seq Scan on partsupp (cost=0.00..108031.62 rows=1 width=144) (actual time=360.577..360.599 rows=0 loops=3) Filter: ((ps_comment)::text ~~ ', even theodolites. regular, final theodolites eat after the carefully pending foxes. furiously regular deposits sleep slyly. carefully bold realms above the ironic dependencies haggle careful'::text) Rows Removed by Filter: 1333333 -> Seq Scan on orders orders3 (cost=0.00..205467.37 rows=7500737 width=79) (actual time=0.064..1544.990 rows=7500000 loops=1) Planning Time: 0.278 ms Execution Time: 8087.714 ms (16 rows)
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ----- Limit (cost=109031.74..109032.26 rows=14 width=223) (actual time=363.883..363.890 rows=14 loops=1) -> Nested Loop Left Join (cost=109031.74..389506.49 rows=7500737 width=223) (actual time=363.882..363.887 rows=14 loops=1) -> Limit (cost=109031.74..109031.74 rows=1 width=144) (actual time=363.859..363.859 rows=1 loops=1) -> Sort (cost=109031.74..109031.74 rows=1 width=144) (actual time=363.858..363.858 rows=1 loops=1) Sort Key: partsupp.ps_supplycost Sort Method: quicksort Memory: 25kB -> Gather (cost=1000.00..109031.73 rows=1 width=144) (actual time=363.447..370.107 rows=1 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on partsupp (cost=0.00..108031.62 rows=1 width=144) (actual time=360.033..360.101 rows=0 loops=3) Filter: ((ps_comment)::text ~~ ', even theodolites. regular, final theodolites eat after the carefully pending foxes. furiously regular deposits sleep slyly. carefully bold realms above the ironic dependencies haggle careful'::t ext) Rows Removed by Filter: 1333333 -> Seq Scan on orders orders3 (cost=0.00..205467.37 rows=7500737 width=79) (actual time=0.016..0.017 rows=14 loops=1) Planning Time: 0.228 ms Execution Time: 370.200 ms (15 rows) *Expected Behavior Since these two queries are semantically equivalent, we were hoping that PostgreSQL
would evaluate them in roughly the same amount of time. *Test Environment Ubuntu 20.04 machine "Linux panda 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux" PostgreSQL v12.3 Database: TPC-H benchmark (with scale factor 5) The description of table partsupp and orders is as follows: tpch5=# \d partsupp; Table "public.partsupp" Column | Type | Collation | Nullable | Default ---------------+------------------------+-----------+----------+--------- ps_partkey | integer | | not null | ps_suppkey | integer | | not null | ps_availqty | integer | | not null | ps_supplycost | numeric(15,2) | | not null | ps_comment | character varying(199) | | not null | Indexes: "partsupp_pkey" PRIMARY KEY, btree (ps_partkey, ps_suppkey) Foreign-key constraints: "partsupp_fk1" FOREIGN KEY (ps_suppkey) REFERENCES supplier(s_suppkey) "partsupp_fk2" FOREIGN KEY (ps_partkey) REFERENCES part(p_partkey) Referenced by: TABLE "lineitem" CONSTRAINT "lineitem_fk2" FOREIGN KEY (l_partkey, l_suppkey) REFERENCES partsupp(ps_partkey, ps_suppkey) tpch5=# \d orders; Table "public.orders" Column | Type | Collation | Nullable | Default -----------------+-----------------------+-----------+----------+--------- o_orderkey | integer | | not null | o_custkey | integer | | not null | o_orderstatus | character(1) | | not null | o_totalprice | numeric(15,2) | | not null | o_orderdate | date | | not null | o_orderpriority | character(15) | | not null | o_clerk | character(15) | | not null | o_shippriority | integer | | not null | o_comment | character varying(79) | | not null | Indexes: "orders_pkey" PRIMARY KEY, btree (o_orderkey) Foreign-key constraints: "orders_fk1" FOREIGN KEY (o_custkey) REFERENCES customer(c_custkey) Referenced by: TABLE "lineitem" CONSTRAINT "lineitem_fk1" FOREIGN KEY (l_orderkey) REFERENCES orders(o_orderkey) *Here are the steps for reproducing our observations:
tar xzvf tpch5_postgresql.tar.gz cd tpch5_postgresql db=tpch5 createdb $db psql -d $db < dss.ddl for i in `ls *.tbl` do echo $i name=`echo $i|cut -d'.' -f1` psql -d $db -c "COPY $name FROM '`pwd`/$i' DELIMITER '|' ENCODING 'LATIN1';" done psql -d $db < dss_postgres.ri
|