On 9/5/21 9:35 AM, Richard Michael wrote:
TL;DR --
1/ My basic prepared statement doesn't seem to start out with a custom
plan
(i.e., no parameter symbols in query text).
2/ EXPLAIN EXECUTE of prepared statement doesn't show query text, as the
documentation seems to imply. Should it?
3/ How can I observe the effect of plan_cache_mode?
e.g., Possible plan change from custom to generic
Apologies if this belongs on -novice.
Details --
I am trying to understand `plan_cache_mode`, related to a report of a
performance problem (resolved by changing `auto` to
`force_custom_plan`). I do not have access to the server or data in the
report, so I am trying to reproduce a similar situation myself. I am
using PostgreSQL 12.
My first step is to observe the change in the plan, from custom to
generic, after the first five queries. I have unsuccessfully tried to
do this with both `EXPLAIN` and the `auto_explain` extension.
I would appreciate help trying to understand what's happening and what
I've misunderstood.
(After I see the plan changing, I'll use more complicated data and
queries to investigate different performance scenarios. Eventually, I'm
interested in queries using the extended protocol. I'm unsure if
parse/bind/execute
will go through the same server code path as `PREPARE ... ; EXECUTE ..`.)
The `PREPARE` documentation
(https://www.postgresql.org/docs/12/sql-prepare.html
<https://www.postgresql.org/docs/12/sql-prepare.html>) indicates:
1/ Regarding `plan_cache_mode`, "... the first five executions are
done with custom plans ..."
2/ Regarding `EXPLAIN EXECUTE ...`, "To examine the query plan
PostgreSQL is using for a prepared statement, use EXPLAIN (...). If a
generic plan is in use, it will contain parameter symbols $n, while a
custom plan will have the supplied parameter values substituted into
it."
Using psql, I tried preparing and explaining a very basic query: `SELECT
$1 AS data`.
Note: Wireshark revealed psql uses simple queries (`PREPARE ...`,
`EXPLAIN EXECUTE ...` are `Q` messages), not the extended protocol
(i.e., parse, bind, execute). I mention this because previous list
posts mention some libpq prepared statement functions do not result in
the planning heuristic used by `plan_cache_mode`.
Based on the documentation, I expected the first planned query text to
be: `SELECT 10 AS data`, since it should be a custom plan with
substituted values. However, the query text always contains a parameter
symbol: `SELECT $1 AS data`.
My questions:
1/ Slightly related, the `EXPLAIN EXECUTE(...)` output does not show the
query text, as the documentation seems to suggest it should (so one may
look for parameter symbols). Why not?
(Although, none of the documented EXPLAIN options mentions query text
display. So perhaps it never does?)
```
~# PREPARE one_param AS SELECT $1 AS data;
~# EXPLAIN EXECUTE one_param(10);
QUERY PLAN
-------------------------------------------
Result (cost=0.00..0.01 rows=1 width=32)
(1 row)
```
2/ The query text was logged by `auto_explain`. However, it contains a
parameter symbol; why? Also, why is the logged query a `PREPARE`
statement
(as if a new prepared statement is being created), instead of only the
`SELECT ..` which was executed?
```
LOG: statement: EXPLAIN EXECUTE one_param(10);
LOG: duration: 0.000 ms plan:
Query Text: PREPARE one_param AS SELECT $1 AS data;
Result (cost=0.00..0.01 rows=1 width=32)
Output: '10'::text
```
Let me know if I should post my postgresql.conf and `auto_explain`
settings.
Thanks for any advice,
Richard
Well, some other databases that shall remain unnamed have a thing called
"bind variable peeking". Essentially, the database instance uses the
first set of bind variables to create the execution plan. That is far
from optimal because the first set of the bind variables encountered by
the instance may not be representative of what the vast majority of the
work is comprised of. That particular database also has tricks like
dynamic statistics sampling, cardinality feedback and adaptive plans (if
the database finds a cheaper plan, it can substitute it on the go). All
of that mess results in high degree of unpredictability. With that
particular database, plan stability is like El Dorado: everybody is
looking for it but nobody can find it. There are outlines, baselines and
SQL patches, none of which is quite satisfactory.
Postgres has a different method: it executes the same statement with 5
different sets of bind variables and if it finds a plan that is cheaper
than the generic plan, it caches it and uses it in the future. Caching
is pertinent for one session only. There is no cursor sharing between
sessions. If it doesn't find such a plan, it uses a generic plan with
hard coded values instead of the bind values and table statistics.
People switching from Oracle, me being one of those, frequently make
mistake of using bind variables in Postgres. Sometimes, there is no
choice, for instance if your app uses an application server and an ORM.
Postgres doesn't have the memory structure called "shared pool" and
cursors and compiled procedures are not cached. That means that by using
Postgres, you are deprived of all the fun managing the cursor caching,
shared pool, database statistics and latches (don't ask, please). There
is a small consolation that you get to keep a ton of money in your
pocket. You also can have a stable plan. There are no adaptive plans or
cursor sharing.
Frank Pachot, an Oracle Ace, has an excellent article on the subject:
https://franckpachot.medium.com/postgresql-bind-variable-peeking-fb4be4942252
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com