Hi there,
I'm wondering if anyone has any insight into what might make the database choose a sequential scan for a query (table defs and plan below) like :
SELECT orders.orderid FROM orders
WHERE (
orders.orderid IN ('546111')
OR
orders.orderid IN (select orderid FROM orderstotrans WHERE (transid IN ('546111')))
);
I have a couple of environments, all on Postgresql 13.7 and:
- on one the query executes with an sequential scan on the orders table
- on the other sequential scan on an index (ie walks index and filters, rather than looking up ids on the index as an index condition.)
Plan and tables are below, but it seems to me that the planner knows the subplan is going to return 1 row (max) and should "know" that there is a max of 2 IDs to look up an indexes would be faster than a sequential scan (of either table or index) and filter. I've tried re analyzing to make sure stats are good and it hasn't helped
I can get a good plan that does use the index efficiently by using a union, eg:
select orders.orderid FROM orders
WHERE (
orders.orderid IN (
SELECT '546111'
UNION
SELECT orderid FROM orderstotrans WHERE (transid IN ('546111'))
)
);
but I want to understand what warning signs I should be aware of with the original query that put it on the path of a bad plan, so I don't do it again.
Plan - seq scan of table:
=====
> explain
select orders.orderid FROM orders WHERE (orders.orderid IN ('546111') OR orders.orderid IN (select orderid FROM orderstotrans WHERE (transid IN ('546111'))));
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on orders (cost=8.45..486270.87 rows=4302781 width=8)
Filter: ((orderid = '546111'::bigint) OR (hashed SubPlan 1))
SubPlan 1
-> Index Scan using orderstotrans_transid_key on orderstotrans (cost=0.43..8.45 rows=1 width=8)
Index Cond: (transid = '546111'::bigint)
(5 rows)
=====
Plan - Seq scan and filter of index:
=====
> explain select orders.orderid FROM orders WHERE (orders.orderid IN ('546111') OR orders.orderid IN (select orderid FROM orderstotrans WHERE (transid IN ('546111'))));
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Index Only Scan using orders_pkey on orders (cost=9.16..4067888.60 rows=64760840 width=8)
Filter: ((orderid = '546111'::bigint) OR (hashed SubPlan 1))
SubPlan 1
-> Index Scan using orderstotrans_transid_key on orderstotrans (cost=0.57..8.59 rows=1 width=8)
Index Cond: (transid = '546111'::bigint)
(5 rows)
=====
Tables:
=====
Table "test.orders"
Column | Type | Collation | Nullable | Default
----------------------+-----------------------------+-----------+----------+--------------
orderid | bigint | | not null |
istest | smallint | | not null | 0
orderstatusid | integer | | |
customername | text | | |
customeraddress | text | | |
customercountry | text | | |
customercity | text | | |
customerstate | text | | |
customerzip | text | | |
"orders_pkey" PRIMARY KEY, btree (orderid)
Table "test.orderstotrans"
Column | Type | Collation | Nullable | Default
-------------+---------+-----------+----------+---------
orderid | bigint | | |
transid | bigint | | |
orderitemid | integer | | |
Indexes:
"orderstotrans_orderid_idx" btree (orderid)
"orderstotrans_orderitemid_idx" btree (orderitemid)
"orderstotrans_transid_key" UNIQUE, btree (transid)
Happier plan for the union version:
====
explain select orders.orderid FROM orders
WHERE (
orders.orderid IN (
SELECT '3131275553'
UNION
select orderid FROM orderstotrans WHERE (transid IN ('3131275553'))
)
);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=9.21..21.84 rows=2 width=8) (actual time=0.034..0.043 rows=1 loops=1)
-> Unique (cost=8.64..8.65 rows=2 width=8) (actual time=0.024..0.026 rows=2 loops=1)
-> Sort (cost=8.64..8.64 rows=2 width=8) (actual time=0.023..0.024 rows=2 loops=1)
Sort Key: ('3131275553'::bigint)
Sort Method: quicksort Memory: 25kB
-> Append (cost=0.00..8.63 rows=2 width=8) (actual time=0.001..0.019 rows=2 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=1)
-> Index Scan using orderstotrans_transid_key on orderstotrans (cost=0.57..8.59 rows=1 width=8) (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: (transid = '3131275553'::bigint)
-> Index Only Scan using orders_pkey on orders (cost=0.57..6.58 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2)
Index Cond: (orderid = ('3131275553'::bigint))
Heap Fetches: 0
Planning Time: 0.165 ms
Execution Time: 0.065 ms
(14 rows)
====
(though that plan is a bit misleading, as that index condition isn't exactly what is used, ie with:
select orders.orderid FROM orders
WHERE (
orders.orderid IN (
SELECT '3131275553'
UNION
select orderid FROM orderstotrans WHERE (transid IN ('3131275553'))
)
);
orderid
-----------
439155713
(1 row)
the orderid it matches, isn't the one the planner showed, but it works)
I'm wondering if anyone has any insight into what might make the database choose a sequential scan for a query (table defs and plan below) like :
SELECT orders.orderid FROM orders
WHERE (
orders.orderid IN ('546111')
OR
orders.orderid IN (select orderid FROM orderstotrans WHERE (transid IN ('546111')))
);
I have a couple of environments, all on Postgresql 13.7 and:
- on one the query executes with an sequential scan on the orders table
- on the other sequential scan on an index (ie walks index and filters, rather than looking up ids on the index as an index condition.)
Plan and tables are below, but it seems to me that the planner knows the subplan is going to return 1 row (max) and should "know" that there is a max of 2 IDs to look up an indexes would be faster than a sequential scan (of either table or index) and filter. I've tried re analyzing to make sure stats are good and it hasn't helped
I can get a good plan that does use the index efficiently by using a union, eg:
select orders.orderid FROM orders
WHERE (
orders.orderid IN (
SELECT '546111'
UNION
SELECT orderid FROM orderstotrans WHERE (transid IN ('546111'))
)
);
but I want to understand what warning signs I should be aware of with the original query that put it on the path of a bad plan, so I don't do it again.
Plan - seq scan of table:
=====
> explain
select orders.orderid FROM orders WHERE (orders.orderid IN ('546111') OR orders.orderid IN (select orderid FROM orderstotrans WHERE (transid IN ('546111'))));
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on orders (cost=8.45..486270.87 rows=4302781 width=8)
Filter: ((orderid = '546111'::bigint) OR (hashed SubPlan 1))
SubPlan 1
-> Index Scan using orderstotrans_transid_key on orderstotrans (cost=0.43..8.45 rows=1 width=8)
Index Cond: (transid = '546111'::bigint)
(5 rows)
=====
Plan - Seq scan and filter of index:
=====
> explain select orders.orderid FROM orders WHERE (orders.orderid IN ('546111') OR orders.orderid IN (select orderid FROM orderstotrans WHERE (transid IN ('546111'))));
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Index Only Scan using orders_pkey on orders (cost=9.16..4067888.60 rows=64760840 width=8)
Filter: ((orderid = '546111'::bigint) OR (hashed SubPlan 1))
SubPlan 1
-> Index Scan using orderstotrans_transid_key on orderstotrans (cost=0.57..8.59 rows=1 width=8)
Index Cond: (transid = '546111'::bigint)
(5 rows)
=====
Tables:
=====
Table "test.orders"
Column | Type | Collation | Nullable | Default
----------------------+-----------------------------+-----------+----------+--------------
orderid | bigint | | not null |
istest | smallint | | not null | 0
orderstatusid | integer | | |
customername | text | | |
customeraddress | text | | |
customercountry | text | | |
customercity | text | | |
customerstate | text | | |
customerzip | text | | |
"orders_pkey" PRIMARY KEY, btree (orderid)
Table "test.orderstotrans"
Column | Type | Collation | Nullable | Default
-------------+---------+-----------+----------+---------
orderid | bigint | | |
transid | bigint | | |
orderitemid | integer | | |
Indexes:
"orderstotrans_orderid_idx" btree (orderid)
"orderstotrans_orderitemid_idx" btree (orderitemid)
"orderstotrans_transid_key" UNIQUE, btree (transid)
Happier plan for the union version:
====
explain select orders.orderid FROM orders
WHERE (
orders.orderid IN (
SELECT '3131275553'
UNION
select orderid FROM orderstotrans WHERE (transid IN ('3131275553'))
)
);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=9.21..21.84 rows=2 width=8) (actual time=0.034..0.043 rows=1 loops=1)
-> Unique (cost=8.64..8.65 rows=2 width=8) (actual time=0.024..0.026 rows=2 loops=1)
-> Sort (cost=8.64..8.64 rows=2 width=8) (actual time=0.023..0.024 rows=2 loops=1)
Sort Key: ('3131275553'::bigint)
Sort Method: quicksort Memory: 25kB
-> Append (cost=0.00..8.63 rows=2 width=8) (actual time=0.001..0.019 rows=2 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=1)
-> Index Scan using orderstotrans_transid_key on orderstotrans (cost=0.57..8.59 rows=1 width=8) (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: (transid = '3131275553'::bigint)
-> Index Only Scan using orders_pkey on orders (cost=0.57..6.58 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2)
Index Cond: (orderid = ('3131275553'::bigint))
Heap Fetches: 0
Planning Time: 0.165 ms
Execution Time: 0.065 ms
(14 rows)
====
(though that plan is a bit misleading, as that index condition isn't exactly what is used, ie with:
select orders.orderid FROM orders
WHERE (
orders.orderid IN (
SELECT '3131275553'
UNION
select orderid FROM orderstotrans WHERE (transid IN ('3131275553'))
)
);
orderid
-----------
439155713
(1 row)
the orderid it matches, isn't the one the planner showed, but it works)