Re: Does Postgres have consistent identifiers (plan hash value) for explain plans?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Apologies if I'm not using the appropriate Postgres-specific terms.  The simplest implementation would consider only the information that is consistent across systems and executions - node types, relation and index names (not ids), aliases ...  It would ignore bind variable values and execution statistics.  This would make it easy to find:
  • Queries with different plans (we would then investigate the causes)
  • Queries where the plans changed over time (we would then investigate the causes)
  • The aggregated cost of a query plan shared by multiple queries (due to differences in the number of items in an IN list, ...) This is useful when no single query in the group registers high, but the group of queries registers high.
  • Because we use consistent table and index names across environments, this would make it possible to find queries with different plans in different environments.
In a perfect world, the hash would include the filters, index conds, ... with the constant values masked out, but I realize that's much more complicated.  (Without this, we could see plans that are applying different numbers of predicates against an index mapping to the same hash value, but it would still be a big improvement.)

As mentioned before, we are currently storing the explain plans in a database table.  We use a combination of columns and syntax when querying for the long executions to display some contextual information about the explain plans (does it have a Materialize node, Init Plan, Sub Plan, ...)  Based on that little contextual information, we can see that there are multiple plans for some queries. Based on manual investigation, we know that there other plan differences.  It is very expensive right now to try to figure out if a plan is changing over time, why some executions are more expensive than others, ...  

Thanks,
Jerry

On Mon, Dec 4, 2023 at 7:29 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Michael Paquier <michael@xxxxxxxxxxx> writes:
> On Mon, Dec 04, 2023 at 09:57:24AM -0500, Tom Lane wrote:
>> Jerry Brenner <jbrenner@xxxxxxxxxxxxx> writes:
>>> Both Oracle and SQL Server have
>>> consistent hash values for query plans and that makes it easy to identify
>>> when there are multiple plans for the same query.  Does that concept exist
>>> in later releases of Postgres (and is the value stored in the json explain
>>> plan)?

>> No, there's no support currently for obtaining a hash value that's
>> associated with a plan rather than an input query tree.

> PlannerGlobal includes a no_query_jumble that gets inherited by all
> its lower-level nodes, so adding support for hashes compiled from
> these node structures would not be that complicated.  My point is that
> the basic infrastructure is in place in the tree to be able to do
> that, and it should not be a problem to even publish the compiled
> hashes in EXPLAIN outputs, behind an option of course.

Well, yeah, we could fairly easily activate that infrastructure for
plans, but we haven't.  More to the point, it's not clear to me that
that would satisfy the OP's request for "consistent" hash values.
The hashes would vary depending on object OID values, system version,
possibly endianness, etc.

I'm also wondering exactly what the OP thinks qualifies as different
plans.  Remembering the amount of fooling-around that's gone on with
querytree hashes to satisfy various people's ill-defined desires for
pg_stat_statements aggregation behavior, I'm not really eager to buy
into the same definitional morass at the plan level.

                        regards, tom lane


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux