Good day,
I have a recursive CTE
where a table scan occurs, even though there doesn't seem to be
a good reason for it.
It seems the planner
came to the conclusion that columns that are not actually used
in the output, joins or a where clause are a part of the output.
It's not a performance
problem now and the query runs quickly (which is why I haven't
posted it on the performance mailing list). What bothers me is
that the scan seems to be there without being really necessary
and I would like to avoid any extra I/O. I would expect the plan
to be close the simplified query without CTE posted after the
CTE query below.
I'm also worried that
this could get worse over time and that it might influence the
query's performance.
The question:
Am I missing something
and the table scan is necessary?
If not, is there a way
to avoid it?
This is the query:
WITH RECURSIVE
user_subordinates
AS
(
SELECT
ur1."id"
AS var_id,
ur1.id_user_parent
AS var_id_user_parent,
ur1.login
AS var_login,
ur1_1.login
AS var_login_parent,
1::smallint
AS var_sort_order
FROM
"user"
ur1
JOIN
"user"
ur1_1
ON(ur1.id_user_parent=ur1_1."id")
WHERE
ur1.id
= 3970
AND
ur1.disabled=false
UNION
ALL
SELECT
ur2."id"
AS var_id,
ur2.id_user_parent
AS var_id_user_parent,
ur2.login
AS var_login,
ups1.var_login
AS var_login_parent,
(ups1.var_sort_order
+ 1)::smallint AS var_sort_order
FROM
user_subordinates
ups1
JOIN
"user"
ur2
ON( ups1.var_id
= ur2.id_user_parent
AND
ups1.var_id <> ur2.id
)
)
SELECT
var_id,
var_id_user_parent,
var_login,
var_login_parent,
var_sort_order
FROM
user_subordinates
ORDER BY
var_sort_order,
var_id,
var_id_user_parent
;
This is its execution
plan:
http://explain.depesz.com/s/4hY
This is a simplified
version of the query without CTE:
SELECT
ur2."id"
AS var_id,
ur2.id_user_parent
AS var_id_user_parent,
ur2.login
AS var_login,
ups1.var_login
AS var_login_parent,
(ups1.var_sort_order
+ 1)::smallint AS var_sort_order
FROM
(
SELECT
ur1."id"
AS var_id,
ur1.id_user_parent
AS var_id_user_parent,
ur1.login
AS var_login,
ur1_1.login
AS var_login_parent,
1::smallint
AS var_sort_order
FROM
"user"
ur1
JOIN
"user"
ur1_1
ON(ur1.id_user_parent=ur1_1."id")
WHERE
ur1.id
= 3970
AND
ur1.disabled=false
)
ups1
JOIN
"user"
ur2
ON( ups1.var_id
= ur2.id_user_parent
AND
ups1.var_id <> ur2.id
)
;
Its plan is here:
http://explain.depesz.com/s/Ak3
Here's the table's
DDL:
CREATE TABLE "user"
(
id
bigserial NOT NULL, -- Unique row identifier
id_user_parent
bigint NOT NULL DEFAULT 3, -- Identifier of user's parent user
id_address
bigint NOT NULL,
login
character varying NOT NULL, -- Login name of a user
password
character varying NOT NULL, -- Login password of a user
date_created
timestamp without time zone NOT NULL DEFAULT now(), -- Date and
time at which the record was created
id_user_created
bigint NOT NULL, -- Identifier of a user who created the row
date_modified
timestamp without time zone, -- Date and time at which the
record was modified
id_user_modified
bigint, -- Identifier of a user who modified the row
disabled
boolean NOT NULL DEFAULT true, -- Its value is set to true when
user account is disabled, false when enabled.
activation_hash
character varying NOT NULL, -- Activation hash that a hashed
string sent to the user matches against during activation.
invoiced
boolean NOT NULL DEFAULT false, -- True is user wishes to be
invoiced, false otherwise.
CONSTRAINT
user_pkey PRIMARY KEY (id),
CONSTRAINT
user_id_address_fkey FOREIGN KEY (id_address) REFERENCES address
(id),
CONSTRAINT
user_id_user_created_fkey FOREIGN KEY (id_user_created)
REFERENCES "user" (id),
CONSTRAINT
user_id_user_modified_fkey FOREIGN KEY (id_user_modified)
REFERENCES "user" (id),
CONSTRAINT
u_login UNIQUE (login)
);
CREATE INDEX fki_user_id_address_fkey
ON "user" USING btree(id_address);
CREATE INDEX ix__user__id_user_parent
ON "user" USING btree(id_user_parent);
CREATE UNIQUE INDEX ix__user__login
ON "user" USING btree (lower(login::text) COLLATE
pg_catalog."default");
CREATE UNIQUE INDEX
ix__user__login__varchar_pattern_ops ON "user" USING btree
(lower(login::text) COLLATE pg_catalog."default"
varchar_pattern_ops);
Environment: Windows 7
Professional x64, PostgreSQL 9.3.1. The table has been analysed
before executing the query and getting the explain result. It's
on my workstation and nobody else uses the database but me.
Any thought on this
are appreciated.
Thank you.
Peter Slapansky
You can "set enable_seqscan = off" to see what the planner thinks of
using the index on id_user_parent.