Interesting query plan change linked to the LIMIT parameter

"Yannick Le Guédart" <yannick@xxxxxxxxxxxxx> · Tue, 20 Jan 2009 16:45:45 +0100

Greetings,

I'm experiencing a strange query plan change on a "simple" request, based on the LIMIT parameter. I present here two tables, named _article and _comment. they are part of a much larger database.

It's a tree-like database (making use of the ltree to keep the coherence), and in our case, an article can have any number of comments (linked by _article.id = _comment.parent_id), and a _comment can itself have ONE comment (at most). 

The _article table contains 12 millions t-uples, and the _comment table around 26 millions. The server runs a postgresql 8.3.5 in its 64bits version.

Here are the tables definition :

-- _article table

CREATE TABLE "ob2"."_article" (

    "id" bigint NOT NULL DEFAULT nextval('element_id_sequence'::regclass),
    "parent_id" bigint,

    "path" ltree,
    "data" text,

    "date_creation" timestamp without time zone NOT NULL DEFAULT now(),
    "date_publishing" timestamp without time zone NOT NULL DEFAULT now(),

    "date_modification" timestamp without time zone NOT NULL DEFAULT now(),
    "counters" hstore,

    "reference" integer NOT NULL DEFAULT nextval('_article_reference_seq'::regclass),
    "title" character varying NOT NULL,

    "text" text,
    "blog_id" bigint,

    "user_id" bigint,
    "site_id" bigint,

    "topic_id" bigint,
    "community_id" bigint,

    CONSTRAINT "_article_pkey" PRIMARY KEY (id)
) WITHOUT OIDS;

ALTER TABLE ONLY "ob2"."_article" ALTER COLUMN "path" SET STORAGE PLAIN;
ALTER TABLE ONLY "ob2"."_article" ALTER COLUMN "title" SET STORAGE PLAIN;

-- Indexes

CREATE UNIQUE INDEX _article_pkey ON _article USING btree (id);

CREATE INDEX gist_idx_article_path ON _article USING gist (path);
CREATE INDEX idx_article_blog_id ON _article USING btree (blog_id);

CREATE INDEX idx_article_community_id ON _article USING btree (community_id);
CREATE INDEX idx_article_date_creation ON _article USING btree (date_creation);

CREATE INDEX idx_article_date_modification ON _article USING btree (date_modification);
CREATE INDEX idx_article_date_publishing ON _article USING btree (date_publishing);

CREATE INDEX idx_article_parent_id ON _article USING btree (parent_id);
CREATE UNIQUE INDEX idx_article_reference_unique ON _article USING btree (reference);

CREATE INDEX idx_article_site_id ON _article USING btree (site_id);
CREATE INDEX idx_article_topic_id ON _article USING btree (topic_id);

CREATE INDEX idx_article_user_id ON _article USING btree (user_id);

-- _comment table

CREATE TABLE "ob2"."_comment" (
    "id" bigint NOT NULL DEFAULT nextval('element_id_sequence'::regclass),

    "parent_id" bigint,
    "path" ltree,

    "data" text,
    "date_creation" timestamp without time zone NOT NULL DEFAULT now(),

    "date_publishing" timestamp without time zone NOT NULL DEFAULT now(),
    "date_modification" timestamp without time zone NOT NULL DEFAULT now(),

    "counters" hstore,
    "reference" integer NOT NULL DEFAULT nextval('_comment_reference_seq'::regclass),

    "text" text,
    "article_id" bigint,

    "blog_id" bigint,
    "user_id" bigint,

    "site_id" bigint,
    CONSTRAINT "_comment_pkey" PRIMARY KEY (id)

) WITHOUT OIDS;
ALTER TABLE ONLY "ob2"."_comment" ALTER COLUMN "path" SET STORAGE PLAIN;

-- Indexes

CREATE UNIQUE INDEX _comment_pkey ON _comment USING btree (id);

CREATE INDEX gist_idx_comment_path ON _comment USING gist (path);
CREATE INDEX idx_comment_date_creation ON _comment USING btree (date_creation);

CREATE INDEX idx_comment_date_publishing ON _comment USING btree (date_publishing);
CREATE INDEX idx_comment_parent_id ON _comment USING btree (parent_id);

CREATE INDEX idx_comment_reference ON _comment USING btree (reference);

Now I created a function to get simply the comment reply to a given comment :

CREATE OR REPLACE FUNCTION get_comment_response (BIGINT) RETURNS _comment AS

$$
    SELECT * FROM _comment WHERE parent_id = $1;

$$
    STABLE

    COST 1
    LANGUAGE SQL;

Ok, now, all is set. I'd like to get with a simple query the comments of a given article, ordered by publishing date, as well as their replies if they exists. So I write this request :

SELECT 
    _comment.id,
    (get_comment_response(_comment.id)).id AS r_id

FROM   _comment
INNER JOIN _article

    ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'

ORDER BY _comment.date_publishing ASC
OFFSET 0

LIMIT 10;
Results are good, quite fast, BUT, when executing tests I discovered something very strange. The query was fast for 3+ comments, but very slow with a limit of 1 or 2 ! Just because the query plan change :

EXPLAIN 

SELECT _comment.id,
        (get_comment_response(_comment.id)).id AS r_id

FROM   _comment
INNER JOIN _article

        ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'

ORDER BY _comment.id ASC
OFFSET 0

LIMIT 1000;

                                                  QUERY PLAN                                                   

---------------------------------------------------------------------------------------------------------------
 Limit  (cost=10261.19..10263.69 rows=1000 width=8)

   ->  Sort  (cost=10261.19..10281.06 rows=7949 width=8)
         Sort Key: _comment.id

         ->  Nested Loop  (cost=0.00..9825.35 rows=7949 width=8)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8)

                     Index Cond: (id = 17355952::bigint)
               ->  Index Scan using idx_comment_parent_id on _comment  (cost=0.00..9716.44 rows=7949 width=16)

                     Index Cond: (_comment.parent_id = 17355952::bigint)
(8 rows)

EXPLAIN 

SELECT _comment.id,
        (get_comment_response(_comment.id)).id AS r_id

FROM   _comment
INNER JOIN _article

        ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'

ORDER BY _comment.id ASC
OFFSET 0

LIMIT 1;
                                             QUERY PLAN                                              

-----------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3588.42 rows=1 width=8)

   ->  Nested Loop  (cost=0.00..28524312.40 rows=7949 width=8)
         ->  Index Scan using _comment_pkey on _comment  (cost=0.00..28448324.73 rows=7949 width=16)

               Filter: (parent_id = 17355952::bigint)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8)

               Index Cond: (_article.id = 17355952::bigint)
(6 rows)

The second query scans the whole comment table which is very dangerous for production servers. 

So did I do something wrong ? Is there a way to handle this issue smoothly ?

Thanks in advance

Yannick