Re: Can I do better than this heapscan and sort?

Merlin Moncure <mmoncure@xxxxxxxxx> · Tue, 26 Jun 2012 08:42:34 -0500

On Tue, Jun 26, 2012 at 8:36 AM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
> On Thu, Jun 21, 2012 at 3:07 PM, Andy Halsall <halsall_andy@xxxxxxxxxxx> wrote:
>> I have two tables node and relationship. Each relationship record connects
>> two nodes and has an application keys (unfortunately named) that can be used
>> by the application to look-up a relationship and get from one node to the
>> other.
>>
>> My query uses a node id and a description of a relationship from the node,
>> and must find the "next" relationship that the node has. It does this by
>> finding all the relationships that could be "next", ordering them and then
>> getting the first.
>>
>> Details are below but I end up with 6896 candidates for "next".
>>
>> If I'm reading the output correctly it takes 13.509 ms to apply the filter
>> and another 7 ms or so to do the sort of the remaining 6896 nodes.
>>
>> Have tried many index combinations to try and improve the results. I suspect
>> that with so many nodes to sort, postgresql will opt for heap scan rather
>> than index. But why does it not use the IDX_order_sort_down_2 index for the
>> sort?
>>
>> Thanks,
>> Andy
>>
>>
>> Details..........
>>
>> Version
>> -------
>> PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.5.2,
>> 64-bit
>>
>> Tables
>> ------
>> CREATE TABLE node (
>>  node_id      bigint NOT NULL,
>>  node_type    int4 NOT NULL,
>>  c_state      int4 NOT NULL,
>>  d_state      int4 NOT NULL,
>>  sort_key     bigint NOT NULL,
>>  permissions  bytea NOT NULL,
>>  audit        bytea  NOT NULL,
>>  pkg_id       bytea NULL,
>>  created      timestamp NOT NULL
>> );
>>
>> CREATE TABLE relationship (
>>  rel_id           bigint NOT NULL,
>>  rel_type         integer NOT NULL,
>>  s_t_n            bigint NOT NULL,
>>  t_s_n            bigint NOT NULL,
>>  state            integer NOT NULL,
>>  control          integer NOT NULL,
>>  sort_key         bigint NOT NULL,
>>  prime_key        bytea NULL,
>>  prime_key_len    integer NOT NULL,
>>  sec_key          bytea NULL,
>>  sec_key_len      integer NOT NULL,
>>  up_sort_key      bigint NOT NULL,
>>  up_prime_key     bytea NULL,
>>  up_prime_key_len integer NOT NULL,
>>  up_sec_key       bytea NULL,
>>  up_sec_key_len   integer NOT NULL,
>>  permissions      bytea NOT NULL,
>>  t_s_n_type       integer NOT NULL,
>>  created          timestamp NOT NULL
>> );
>>
>> Constraints
>> -----------
>> -- Primary keys
>> ALTER TABLE node ADD CONSTRAINT PK_node PRIMARY KEY (node_id);
>>
>> ALTER TABLE relationship ADD CONSTRAINT PK_relationship PRIMARY KEY
>> (rel_id);
>>
>> -- Foreign keys
>> ALTER TABLE relationship ADD CONSTRAINT FK_node_s FOREIGN KEY (s_t_n)
>> REFERENCES node (node_id);
>>
>> ALTER TABLE relationship ADD CONSTRAINT FK_node_n FOREIGN KEY (t_s_n)
>> REFERENCES node (node_id);
>>
>>
>> Indexes
>> -------
>> CREATE INDEX IDX_node_type ON node (node_type ASC) TABLESPACE ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_node_sort_key ON node (sort_key ASC) TABLESPACE
>> ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_relationship_s_t_n ON relationship (s_t_n ASC) TABLESPACE
>> ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_relationship_t_s_n ON relationship (t_s_n ASC) TABLESPACE
>> ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_relationship_type ON relationship (rel_type ASC) TABLESPACE
>> ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_relationship_prime_key ON relationship (prime_key ASC)
>> TABLESPACE ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_relationship_u_prime_key ON relationship (up_prime_key ASC)
>> TABLESPACE ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_relationship_sec_key ON relationship (sec_key ASC)
>> TABLESPACE ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_order_first ON node(sort_key DESC, node_id DESC) TABLESPACE
>> ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_order_sort_down_1 ON relationship(sort_key DESC, prime_key
>> ASC NULLS FIRST, sec_key ASC NULLS FIRST) TABLESPACE ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_order_sort_down_2 ON relationship(sort_key DESC, prime_key
>> ASC NULLS FIRST, sec_key DESC NULLS FIRST) TABLESPACE ds_appex_ts_10
>> ;
>> CREATE INDEX IDX_order_sort_up ON relationship(up_sort_key DESC,
>> up_prime_key ASC NULLS FIRST, up_sec_key ASC NULLS FIRST) TABLESPACE
>> ds_appex_ts_10
>> ;
>>
>> Query
>> -----
>> CREATE OR REPLACE FUNCTION sp_get_rel_sort_dup_sec_desc(in_rel_type1
>> integer, in_rel_type2 integer, in_node_type integer, in_own_guid bigint,
>> in_prev_prime_key bytea, in_prev_prime_key_len integer, in_prev_sec_key
>> bytea, in_prev_sec_key_len integer, in_prev_sort_key bigint, in_ctrl
>> integer) RETURNS select_rel_holder AS
>> '
>> declare
>> h select_rel_holder%rowtype;
>>
>> begin
>>        SELECT INTO h  r.rel_id, r.t_s_n, r.rel_type, r.sort_key,
>>                       r.state,r.permissions, r.control,
>>                       r.prime_key, r.prime_key_len, r.sec_key,
>> r.sec_key_len,
>>                       r.up_prime_key, r.up_prime_key_len, r.up_sec_key,
>> r.up_sec_key_len
>>         FROM relationship r
>>         WHERE r.s_t_n = in_own_guid AND (r.rel_type = in_rel_type1 OR
>> r.rel_type = in_rel_type2)
>>         AND
>>         (
>>             (
>>                (
>>                   r.prime_key > in_prev_prime_key
>>                   OR
>>                   ( r.prime_key = in_prev_prime_key AND r.sec_key <
>> in_prev_sec_key)
>>                )
>>                AND
>>                r.sort_key = in_prev_sort_key
>>             )
>>
>>             OR
>>             r.sort_key < in_prev_sort_key
>>         )
>>         AND t_s_n_type = in_node_type
>>         AND r.control >= in_ctrl
>>         ORDER BY sort_key DESC, prime_key ASC NULLS FIRST, sec_key DESC
>> NULLS FIRST LIMIT 1;
>>         RETURN h;
>> end
>> '
>> language 'plpgsql' STABLE;
>>
>>
>> EXPLAIN ANALYZE output
>> -------------------------------
>>         Limit  (cost=48.90..48.90 rows=1 width=89) (actual
>> time=21.480..21.480 rows=1 loops=1)
>>           Output: rel_id, t_s_n, rel_type, sort_key, state, permissions,
>> control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key,
>> up_prime_key_l
>> en, up_sec_key, up_sec_key_len
>>
>>           ->  Sort  (cost=48.90..48.90 rows=1 width=89) (actual
>> time=21.479..21.479 rows=1 loops=1)
>>                 Output: rel_id, t_s_n, rel_type, sort_key, state,
>> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len,
>> up_prime_key, up_prime
>> _key_len, up_sec_key, up_sec_key_len
>>                 Sort Key: r.sort_key, r.prime_key, r.sec_key
>>                 Sort Method: top-N heapsort  Memory: 25kB
>>
>>                 ->  Bitmap Heap Scan on public.relationship r
>> (cost=3.39..48.89 rows=1 width=89) (actual time=1.034..13.509 rows=6986
>> loops=1)
>>                       Output: rel_id, t_s_n, rel_type, sort_key, state,
>> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len,
>> up_prime_key, up
>> _prime_key_len, up_sec_key, up_sec_key_len
>>                       Recheck Cond: (r.s_t_n = $4)
>>                       Filter: ((r.control >= $10) AND (r.t_s_n_type = $3)
>> AND ((r.rel_type = $1) OR (r.rel_type = $2)) AND ((((r.prime_key > $5) OR
>> ((r.prime_
>> key = $5) AND (r.sec_key < $7))) AND (r.sort_key = $9)) OR (r.sort_key <
>> $9)))
>>
>>                       ->  Bitmap Index Scan on idx_relationship_s_t_n
>> (cost=0.00..3.39 rows=18 width=0) (actual time=0.951..0.951 rows=6989
>> loops=1)
>>                             Index Cond: (r.s_t_n = $4)
>
> Absolutely.  You need to learn and master row-wise comparison.  It was
> added for exactly this purpose :-).
>
> SELECT * FROM foo WHERE (a,b,c) > (a1,b1,c1) ORDER BY a,b,c LIMIT k;
>
> will be fully optimized if you have an index on a,b,c (a1,b1,c1 are
> the last ones you read off).  Be advised that if there is not a lot of
> cardinality on 'a', you may need to disable certain index plans to get
> a good plan in some cases.

hm, one more point: I notice you are mixing ASC/DESC in the index
definition.  Try to avoid doing that: it will make index based paging
of the table more difficult.  If you have to, try transforming the
values so that you can index all the fields ASC or DESC.  This will
also fit easier into row-wise comparisons strategy although it will
slow down insertion a bit.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance