Tom - Thanks for the response. I guess what I am really looking for is a simple way to find all of the columns referenced from a given instance of a table or index from the json file, although it would be even better if it was easy to differentiate between the columns that came from the index vs those that could only come from the table. (We may not have direct access to the database, the indexing may have changed since the plan was captured, ...) I can see that all of the values in "Index Cond" and "Filter" for the given "Alias" are relevant, but it's unclear what portion of the values in "Output" are relevant. Some instances of "Output" contain a superset of the values in "Index Cond" and "Filter", including columns that are not referenced in the query (there are a total of 187 columns in pc_policyperiod and instances in the plan were all of them show up in "Output" but only 7 of them are actually referenced), others contain a mutually exclusive set of values, ... It would be helpful if there was an attribute that contained that information.
I now see that groot2.ID is the only column listed in "Output" from pc_policy here and it doesn't show up anywhere else in the plan. It's actually used in the evaluation of "(hashed SubPlan 3)", but I had to look at the SQL to figure that out.
NOTE: policy_n_producerco_3e8i0ojsyckhx is an index on pc_policy(producercodeofserviceid, retired). It makes sense for reasons beyond this query to add id as the last key column to the index.
"Filter": "((NOT groot_1.assignedrisk) AND ((groot_1.producercodeofrecordid = '10791'::bigint) OR (hashed SubPlan 3)))",
"Rows Removed by Filter": 0,
"Shared Hit Blocks": 549472,
"Shared Read Blocks": 0,
"Shared Dirtied Blocks": 0,
"Shared Written Blocks": 0,
"Local Hit Blocks": 0,
"Local Read Blocks": 0,
"Local Dirtied Blocks": 0,
"Local Written Blocks": 0,
"Temp Read Blocks": 0,
"Temp Written Blocks": 0,
"I/O Read Time": 0.000,
"I/O Write Time": 0.000,
"Plans": [
{
"Node Type": "Index Scan",
"Parent Relationship": "SubPlan",
"Subplan Name": "SubPlan 3",
"Parallel Aware": false,
"Scan Direction": "Forward",
"Index Name": "policy_n_producerco_3e8i0ojsyckhx",
"Relation Name": "pc_policy",
"Schema": "public",
"Alias": "groot2",
"Startup Cost": 0.56,
"Total Cost": 484540.46,
"Plan Rows": 578767,
"Plan Width": 8,
"Actual Startup Time": 0.035,
"Actual Total Time": 490.349,
"Actual Rows": 546045,
"Actual Loops": 1,
"Output": ["groot2.id"],
"Index Cond": "((groot2.producercodeofserviceid = '10791'::bigint) AND (groot2.retired = 0))",
Here's the SQL:
SELECT COUNT(*)
FROM (
SELECT *
FROM (
SELECT /* ISNULL:pc_policycontactrole.EffectiveDate:, ISNULL:pc_policycontactrole.ExpirationDate:, pc:gw.webservice.pc.pc5000.policysearch.PolicySearchAPI#findPolicies_WMIC; */ gRoot.ID col0
FROM pc_policyperiod gRoot
WHERE gRoot.AssignedRisk = $1 AND gRoot.MostRecentModel = $2 AND gRoot.PolicyID IN
(
SELECT qRoots0.PolicyID col0
FROM pc_policyperiod qRoots0
WHERE qRoots0.ID = ANY (ARRAY
(
SELECT qRoots1.BranchID col0
FROM pc_policycontactrole qRoots1
WHERE qRoots1.Subtype = $3 AND qRoots1.ContactDenorm IN
(
SELECT qRoots2.ID col0
FROM pc_contact qRoots2
WHERE qRoots2.FirstNameDenorm = LOWER ($4) AND qRoots2.LastNameDenorm = LOWER ($5) AND qRoots2.Retired = 0)
AND ( ( ( (qRoots1.EffectiveDate <> qRoots1.ExpirationDate) OR (qRoots1.EffectiveDate IS NULL) OR (qRoots1.ExpirationDate IS NULL))))))
AND qRoots0.Retired = 0 AND qRoots0.TemporaryBranch = false)
AND gRoot.Retired = 0 AND gRoot.TemporaryBranch = false AND ( ( ( (gRoot.ProducerCodeOfRecordID = $6) OR (gRoot.PolicyID IN
(
SELECT gRoot3.ID col0
FROM pc_policy gRoot3
WHERE gRoot3.ProducerCodeOfServiceID = $7 AND gRoot3.Retired = 0)))))
UNION
SELECT /* ISNULL:pc_policycontactrole.EffectiveDate:, ISNULL:pc_policycontactrole.ExpirationDate:, pc:gw.webservice.pc.pc5000.policysearch.PolicySearchAPI#findPolicies_WMIC; */ gRoot.ID col0
FROM pc_policyperiod gRoot
WHERE gRoot.AssignedRisk = $8 AND gRoot.MostRecentModel = $9 AND gRoot.PolicyID IN
(
SELECT qRoots0.PolicyID col0
FROM pc_policyperiod qRoots0
WHERE qRoots0.ID = ANY (ARRAY
(
SELECT qRoots1.BranchID col0
FROM pc_policycontactrole qRoots1
WHERE qRoots1.Subtype = $10 AND qRoots1.FirstNameInternalDenorm = LOWER ($11) AND qRoots1.LastNameInternalDenorm = LOWER ($12)
AND ( ( ( (qRoots1.EffectiveDate <> qRoots1.ExpirationDate) OR (qRoots1.EffectiveDate IS NULL) OR (qRoots1.ExpirationDate IS NULL))))))
AND qRoots0.Retired = 0 AND qRoots0.TemporaryBranch = false) AND gRoot.Retired = 0 AND gRoot.TemporaryBranch = false AND ( ( ( (gRoot.ProducerCodeOfRecordID = $13) OR (gRoot.PolicyID IN
(
SELECT gRoot2.ID col0
FROM pc_policy gRoot2
WHERE gRoot2.ProducerCodeOfServiceID = $14 AND gRoot2.Retired = 0)))))) a
FETCH FIRST 301 ROWS ONLY) countTable
I've attached the full json because it is too big to paste (and "Output" doesn't show up in the text output of the tools that I've looked at).
Thanks,
Jerry
On Tue, Jan 2, 2024 at 11:23 AM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> On Tue, Jan 2, 2024 at 1:29 PM Jerry Brenner <jbrenner@xxxxxxxxxxxxx> wrote:
>> - It looks like "Output" includes more than just the columns with
>> predicates and/or being accessed or returned in other nodes.
> Not in my hands. For SELECTs it just lists the columns that are needed.
It depends. The planner may choose to tell a non-top-level scan node
to return all columns, in hopes of saving a tuple projection step at
runtime. That's heuristic and depends on a number of factors, so you
shouldn't count on it happening or not happening.
regards, tom lane
Attachment:
explain-subquery-subplan-verbose-on.json
Description: application/json