On 01/31/2011 08:48 PM, Tom Lane wrote:
> Nicos Panayides <nicos@xxxxxxxxxxxxxx> writes:
>> the following function takes forever to execute as is. I see 'Inserting
>> original actions in temporary table' and nothing after that. If I
>> replace orig_user_id in the FOR loop with 1811 (the same orig_user_id
>> passed as the function parameter) it returns immediately and correctly
>> (the table has indices so it's very fast).
>
> It seems likely that you're getting a different plan for the generic
> case because that user id isn't representative of the overall average
> for the column. You could investigate by explaining a parameterized
> query:
>
>     PREPARE foo (bigint) AS
>       SELECT ... WHERE game_round_actions.user_id = $1 ... ;
>     EXPLAIN EXECUTE foo(1811);
>
> (To really see exactly what's happening, you'd probably need to
> parameterize for each of the plpgsql variables used in the query;
> I'm suspicious that the BETWEEN might be contributing to the
> issue as well.)
>
> Possibly increasing the stats target for the user id column would help,
> but it's hard to be sure without knowing what its distribution is like.
>
> 			regards, tom lane
I tried the prepared statement with both $1 and 1811 for user_id, and here
are the plans I got (first with $1, then with the constant 1811):
"Sort (cost=51704688.71..51704689.50 rows=314 width=57)"
" Sort Key: game_round_actions.action_time, game_round_actions.action_id"
" -> Nested Loop (cost=0.00..51704675.69 rows=314 width=57)"
" -> Seq Scan on game_round_actions (cost=0.00..51702078.26
rows=314 width=53)"
" Filter: ((action_time >= $2) AND (action_time <= $3) AND
(sub_action_id = 0) AND (user_id = $1))"
" -> Index Scan using "PK_game_table" on game_tables
(cost=0.00..8.26 rows=1 width=12)"
" Index Cond: (game_tables.table_id =
game_round_actions.table_id)"
" Filter: (game_tables.game_type_id <> ANY ($4))"
"Sort (cost=226660.58..226661.33 rows=300 width=57)"
" Sort Key: game_round_actions.action_time, game_round_actions.action_id"
" -> Nested Loop (cost=0.00..226648.24 rows=300 width=57)"
" -> Index Scan using i_session on game_round_actions
(cost=0.00..224166.97 rows=300 width=53)"
" Index Cond: ((action_time >= $2) AND (action_time <= $3))"
" Filter: (user_id = 1811)"
" -> Index Scan using "PK_game_table" on game_tables
(cost=0.00..8.26 rows=1 width=12)"
" Index Cond: (game_tables.table_id =
game_round_actions.table_id)"
" Filter: (game_tables.game_type_id <> ANY ($4))"
Here's the table definition:
CREATE TABLE game_round_actions
(
  table_id bigint NOT NULL,
  round_id integer NOT NULL,
  action_id integer NOT NULL,
  seat_id integer NOT NULL,
  action_desc character varying(20) NOT NULL,
  action_area character varying(100),
  amount numeric(16,6),
  action_value character varying(100),
  action_time timestamp without time zone NOT NULL
      DEFAULT CURRENT_TIMESTAMP AT TIME ZONE 'UTC',
  user_id bigint,
  sub_action_id integer NOT NULL, -- 0 for root actions, >0 for generated actions
  CONSTRAINT "PK_game_round_actions"
      PRIMARY KEY (table_id, round_id, action_id, sub_action_id),
  CONSTRAINT fk_game_round_actions_round FOREIGN KEY (table_id, round_id)
      REFERENCES game_rounds (table_id, round_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT fk_game_round_actions_user FOREIGN KEY (table_id, user_id)
      REFERENCES game_table_users (table_id, user_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);
CREATE INDEX i_session
ON game_round_actions
USING btree
(action_time)
WHERE user_id <> 0 AND sub_action_id = 0;
The table contains 1 655 528 000 rows (estimated) and there are about
10000 unique user_ids. The data spans about 2 years.

Shouldn't PostgreSQL realise that in both cases user_id is compared
against a constant value and choose the same plan?
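(As I understand it, the answer may be no: a query inside a plpgsql function
is planned once with user_id as an unknown parameter, so the planner falls
back on an average-selectivity estimate instead of the statistics for 1811.
One workaround I've seen suggested is to build the query dynamically with
EXECUTE so it is re-planned on each call with the real value. A rough sketch
only; the variable names rec, start_time and end_time are made up, and the
column list is inferred from the plan filters above:

```sql
-- Hypothetical plpgsql fragment: interpolating the values with
-- quote_literal() forces a fresh, value-specific plan per call,
-- instead of the cached generic plan built for $1.
FOR rec IN EXECUTE
    'SELECT * FROM game_round_actions'
    || ' WHERE user_id = ' || quote_literal(orig_user_id)
    || '   AND sub_action_id = 0'
    || '   AND action_time BETWEEN ' || quote_literal(start_time)
    || '                       AND ' || quote_literal(end_time)
LOOP
    -- process rec ...
END LOOP;
```

The trade-off is that the query is parsed and planned on every execution.)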
How do I increase the stats target for the column?
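(For reference, I believe the per-column statistics target is raised with
ALTER TABLE; 1000 below is just an example value, and the table needs to be
re-analyzed afterwards for the new target to take effect:

```sql
-- Raise the statistics target for user_id above the server-wide
-- default (default_statistics_target), then refresh the statistics.
ALTER TABLE game_round_actions ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE game_round_actions;
```
)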
--
Regards,
Nicos Panayides
IT Manager
Magneta Technologies Ltd
Tel: +357 22721919, 22317400
Fax: +357 22721917
Web: http://www.magneta.eu
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general