Hi all,
I am running TPC-H on recent postgresql (12.0 and 12.1).
On some of the queries (that may involve parallel scans) I see this interesting behavior:
When these queries are executed back-to-back (sent from psql interactive terminal), the total execution time of them increase monotonically.
I simplified query-1 to demonstrate this effect:
``` example.sql
explain (analyze, buffers) select
max(l_shipdate) as max_data,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '20' day;
```
max(l_shipdate) as max_data,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '20' day;
```
When I execute (from fish) following command:
`for i in (seq 1 20); psql tpch < example.sql | grep Execution; end`
The results I get are as follows:
"
Execution Time: 184.864 ms
Execution Time: 192.758 ms
Execution Time: 197.380 ms
Execution Time: 200.384 ms
Execution Time: 202.950 ms
Execution Time: 205.695 ms
Execution Time: 208.082 ms
Execution Time: 209.108 ms
Execution Time: 212.428 ms
Execution Time: 214.539 ms
Execution Time: 215.799 ms
Execution Time: 219.057 ms
Execution Time: 222.102 ms
Execution Time: 223.779 ms
Execution Time: 227.819 ms
Execution Time: 229.710 ms
Execution Time: 239.439 ms
Execution Time: 237.649 ms
Execution Time: 249.178 ms
Execution Time: 261.268 ms
"
`for i in (seq 1 20); psql tpch < example.sql | grep Execution; end`
The results I get are as follows:
"
Execution Time: 184.864 ms
Execution Time: 192.758 ms
Execution Time: 197.380 ms
Execution Time: 200.384 ms
Execution Time: 202.950 ms
Execution Time: 205.695 ms
Execution Time: 208.082 ms
Execution Time: 209.108 ms
Execution Time: 212.428 ms
Execution Time: 214.539 ms
Execution Time: 215.799 ms
Execution Time: 219.057 ms
Execution Time: 222.102 ms
Execution Time: 223.779 ms
Execution Time: 227.819 ms
Execution Time: 229.710 ms
Execution Time: 239.439 ms
Execution Time: 237.649 ms
Execution Time: 249.178 ms
Execution Time: 261.268 ms
"
In addition, if the repeated more times, the total execution time can end up being 10X and more!!!
When there a wait period in-between queries, (e.g. sleep 10) in the above for loop, this increasing execution time behavior goes a way.
For more complex queries, the "wait period" needs to be longer to avoid the increase in execution time.
Some metadata about this table "lineitem":
tpch=# \d lineitem
Table "public.lineitem"
Column | Type | Collation | Nullable | Default
-----------------+-----------------------+-----------+----------+---------
l_orderkey | integer | | not null |
l_partkey | integer | | not null |
l_suppkey | integer | | not null |
l_linenumber | integer | | not null |
l_quantity | numeric(15,2) | | not null |
l_extendedprice | numeric(15,2) | | not null |
l_discount | numeric(15,2) | | not null |
l_tax | numeric(15,2) | | not null |
l_returnflag | character(1) | | not null |
l_linestatus | character(1) | | not null |
l_shipdate | date | | not null |
l_commitdate | date | | not null |
l_receiptdate | date | | not null |
l_shipinstruct | character(25) | | not null |
l_shipmode | character(10) | | not null |
l_comment | character varying(44) | | not null |
Indexes:
"i_l_commitdate" btree (l_commitdate)
"i_l_orderkey" btree (l_orderkey)
"i_l_orderkey_quantity" btree (l_orderkey, l_quantity)
"i_l_partkey" btree (l_partkey)
"i_l_receiptdate" btree (l_receiptdate)
"i_l_shipdate" btree (l_shipdate)
"i_l_suppkey" btree (l_suppkey)
"i_l_suppkey_partkey" btree (l_partkey, l_suppkey)
Table "public.lineitem"
Column | Type | Collation | Nullable | Default
-----------------+-----------------------+-----------+----------+---------
l_orderkey | integer | | not null |
l_partkey | integer | | not null |
l_suppkey | integer | | not null |
l_linenumber | integer | | not null |
l_quantity | numeric(15,2) | | not null |
l_extendedprice | numeric(15,2) | | not null |
l_discount | numeric(15,2) | | not null |
l_tax | numeric(15,2) | | not null |
l_returnflag | character(1) | | not null |
l_linestatus | character(1) | | not null |
l_shipdate | date | | not null |
l_commitdate | date | | not null |
l_receiptdate | date | | not null |
l_shipinstruct | character(25) | | not null |
l_shipmode | character(10) | | not null |
l_comment | character varying(44) | | not null |
Indexes:
"i_l_commitdate" btree (l_commitdate)
"i_l_orderkey" btree (l_orderkey)
"i_l_orderkey_quantity" btree (l_orderkey, l_quantity)
"i_l_partkey" btree (l_partkey)
"i_l_receiptdate" btree (l_receiptdate)
"i_l_shipdate" btree (l_shipdate)
"i_l_suppkey" btree (l_suppkey)
"i_l_suppkey_partkey" btree (l_partkey, l_suppkey)
tpch=# SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='lineitem';
relname | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------+----------+--------------+---------------+---------+----------+----------------+------------+---------------
lineitem | 112503 | 6.001167e+06 | 112503 | r | 16 | f | | 921903104
(1 row)
relname | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------+----------+--------------+---------------+---------+----------+----------------+------------+---------------
lineitem | 112503 | 6.001167e+06 | 112503 | r | 16 | f | | 921903104
(1 row)
Postgresql 12.0 and 12.1 are all manually installed from source.
Both are running on Ubuntu 16.04 kernel 4.4.0-142-generic, on Intel(R) Core(TM) i7-6700K.
Any help greatly appreciated!
Shijia