On Wed, Jun 27, 2012 at 10:40 AM, Andy Halsall <halsall_andy@xxxxxxxxxxx> wrote: > > >> Date: Tue, 26 Jun 2012 08:42:34 -0500 >> Subject: Re: Can I do better than this heapscan and sort? >> From: mmoncure@xxxxxxxxx >> To: halsall_andy@xxxxxxxxxxx >> CC: pgsql-performance@xxxxxxxxxxxxxx > >> >> On Tue, Jun 26, 2012 at 8:36 AM, Merlin Moncure <mmoncure@xxxxxxxxx> >> wrote: >> > On Thu, Jun 21, 2012 at 3:07 PM, Andy Halsall <halsall_andy@xxxxxxxxxxx> >> > wrote: >> >> I have two tables node and relationship. Each relationship record >> >> connects >> >> two nodes and has an application keys (unfortunately named) that can be >> >> used >> >> by the application to look-up a relationship and get from one node to >> >> the >> >> other. >> >> >> >> My query uses a node id and a description of a relationship from the >> >> node, >> >> and must find the "next" relationship that the node has. It does this >> >> by >> >> finding all the relationships that could be "next", ordering them and >> >> then >> >> getting the first. >> >> >> >> Details are below but I end up with 6896 candidates for "next". >> >> >> >> If I'm reading the output correctly it takes 13.509 ms to apply the >> >> filter >> >> and another 7 ms or so to do the sort of the remaining 6896 nodes. >> >> >> >> Have tried many index combinations to try and improve the results. I >> >> suspect >> >> that with so many nodes to sort, postgresql will opt for heap scan >> >> rather >> >> than index. But why does it not use the IDX_order_sort_down_2 index for >> >> the >> >> sort? >> >> >> >> Thanks, >> >> Andy >> >> >> >> >> >> Details.......... >> >> >> >> Version >> >> ------- >> >> PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) >> >> 4.5.2, >> >> 64-bit >> >> >> >> Tables >> >> ------ >> >> CREATE TABLE node ( >> >> node_id bigint NOT NULL, >> >> node_type int4 NOT NULL, >> >> c_state int4 NOT NULL, >> >> d_state int4 NOT NULL, >> >> sort_key bigint NOT NULL, >> >> permissions bytea NOT NULL, >> >> audit bytea NOT NULL, >> >> pkg_id bytea NULL, >> >> created timestamp NOT NULL >> >> ); >> >> >> >> CREATE TABLE relationship ( >> >> rel_id bigint NOT NULL, >> >> rel_type integer NOT NULL, >> >> s_t_n bigint NOT NULL, >> >> t_s_n bigint NOT NULL, >> >> state integer NOT NULL, >> >> control integer NOT NULL, >> >> sort_key bigint NOT NULL, >> >> prime_key bytea NULL, >> >> prime_key_len integer NOT NULL, >> >> sec_key bytea NULL, >> >> sec_key_len integer NOT NULL, >> >> up_sort_key bigint NOT NULL, >> >> up_prime_key bytea NULL, >> >> up_prime_key_len integer NOT NULL, >> >> up_sec_key bytea NULL, >> >> up_sec_key_len integer NOT NULL, >> >> permissions bytea NOT NULL, >> >> t_s_n_type integer NOT NULL, >> >> created timestamp NOT NULL >> >> ); >> >> >> >> Constraints >> >> ----------- >> >> -- Primary keys >> >> ALTER TABLE node ADD CONSTRAINT PK_node PRIMARY KEY (node_id); >> >> >> >> ALTER TABLE relationship ADD CONSTRAINT PK_relationship PRIMARY KEY >> >> (rel_id); >> >> >> >> -- Foreign keys >> >> ALTER TABLE relationship ADD CONSTRAINT FK_node_s FOREIGN KEY (s_t_n) >> >> REFERENCES node (node_id); >> >> >> >> ALTER TABLE relationship ADD CONSTRAINT FK_node_n FOREIGN KEY (t_s_n) >> >> REFERENCES node (node_id); >> >> >> >> >> >> Indexes >> >> ------- >> >> CREATE INDEX IDX_node_type ON node (node_type ASC) TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_node_sort_key ON node (sort_key ASC) TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_relationship_s_t_n ON relationship (s_t_n ASC) >> >> TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_relationship_t_s_n ON relationship (t_s_n ASC) >> >> TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_relationship_type ON relationship (rel_type ASC) >> >> TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_relationship_prime_key ON relationship (prime_key ASC) >> >> TABLESPACE ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_relationship_u_prime_key ON relationship (up_prime_key >> >> ASC) >> >> TABLESPACE ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_relationship_sec_key ON relationship (sec_key ASC) >> >> TABLESPACE ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_order_first ON node(sort_key DESC, node_id DESC) >> >> TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_order_sort_down_1 ON relationship(sort_key DESC, >> >> prime_key >> >> ASC NULLS FIRST, sec_key ASC NULLS FIRST) TABLESPACE ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_order_sort_down_2 ON relationship(sort_key DESC, >> >> prime_key >> >> ASC NULLS FIRST, sec_key DESC NULLS FIRST) TABLESPACE ds_appex_ts_10 >> >> ; >> >> CREATE INDEX IDX_order_sort_up ON relationship(up_sort_key DESC, >> >> up_prime_key ASC NULLS FIRST, up_sec_key ASC NULLS FIRST) TABLESPACE >> >> ds_appex_ts_10 >> >> ; >> >> >> >> Query >> >> ----- >> >> CREATE OR REPLACE FUNCTION sp_get_rel_sort_dup_sec_desc(in_rel_type1 >> >> integer, in_rel_type2 integer, in_node_type integer, in_own_guid >> >> bigint, >> >> in_prev_prime_key bytea, in_prev_prime_key_len integer, in_prev_sec_key >> >> bytea, in_prev_sec_key_len integer, in_prev_sort_key bigint, in_ctrl >> >> integer) RETURNS select_rel_holder AS >> >> ' >> >> declare >> >> h select_rel_holder%rowtype; >> >> >> >> begin >> >> SELECT INTO h r.rel_id, r.t_s_n, r.rel_type, r.sort_key, >> >> r.state,r.permissions, r.control, >> >> r.prime_key, r.prime_key_len, r.sec_key, >> >> r.sec_key_len, >> >> r.up_prime_key, r.up_prime_key_len, r.up_sec_key, >> >> r.up_sec_key_len >> >> FROM relationship r >> >> WHERE r.s_t_n = in_own_guid AND (r.rel_type = in_rel_type1 OR >> >> r.rel_type = in_rel_type2) >> >> AND >> >> ( >> >> ( >> >> ( >> >> r.prime_key > in_prev_prime_key >> >> OR >> >> ( r.prime_key = in_prev_prime_key AND r.sec_key < >> >> in_prev_sec_key) >> >> ) >> >> AND >> >> r.sort_key = in_prev_sort_key >> >> ) >> >> >> >> OR >> >> r.sort_key < in_prev_sort_key >> >> ) >> >> AND t_s_n_type = in_node_type >> >> AND r.control >= in_ctrl >> >> ORDER BY sort_key DESC, prime_key ASC NULLS FIRST, sec_key DESC >> >> NULLS FIRST LIMIT 1; >> >> RETURN h; >> >> end >> >> ' >> >> language 'plpgsql' STABLE; >> >> >> >> >> >> EXPLAIN ANALYZE output >> >> ------------------------------- >> >> Limit (cost=48.90..48.90 rows=1 width=89) (actual >> >> time=21.480..21.480 rows=1 loops=1) >> >> Output: rel_id, t_s_n, rel_type, sort_key, state, >> >> permissions, >> >> control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key, >> >> up_prime_key_l >> >> en, up_sec_key, up_sec_key_len >> >> >> >> -> Sort (cost=48.90..48.90 rows=1 width=89) (actual >> >> time=21.479..21.479 rows=1 loops=1) >> >> Output: rel_id, t_s_n, rel_type, sort_key, state, >> >> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len, >> >> up_prime_key, up_prime >> >> _key_len, up_sec_key, up_sec_key_len >> >> Sort Key: r.sort_key, r.prime_key, r.sec_key >> >> Sort Method: top-N heapsort Memory: 25kB >> >> >> >> -> Bitmap Heap Scan on public.relationship r >> >> (cost=3.39..48.89 rows=1 width=89) (actual time=1.034..13.509 rows=6986 >> >> loops=1) >> >> Output: rel_id, t_s_n, rel_type, sort_key, state, >> >> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len, >> >> up_prime_key, up >> >> _prime_key_len, up_sec_key, up_sec_key_len >> >> Recheck Cond: (r.s_t_n = $4) >> >> Filter: ((r.control >= $10) AND (r.t_s_n_type = >> >> $3) >> >> AND ((r.rel_type = $1) OR (r.rel_type = $2)) AND ((((r.prime_key > $5) >> >> OR >> >> ((r.prime_ >> >> key = $5) AND (r.sec_key < $7))) AND (r.sort_key = $9)) OR (r.sort_key >> >> < >> >> $9))) >> >> >> >> -> Bitmap Index Scan on idx_relationship_s_t_n >> >> (cost=0.00..3.39 rows=18 width=0) (actual time=0.951..0.951 rows=6989 >> >> loops=1) >> >> Index Cond: (r.s_t_n = $4) >> > >> > Absolutely. You need to learn and master row-wise comparison. It was >> > added for exactly this purpose :-). >> > >> > SELECT * FROM foo WHERE (a,b,c) > (a1,b1,c1) ORDER BY a,b,c LIMIT k; >> > >> > will be fully optimized if you have an index on a,b,c (a1,b1,c1 are >> > the last ones you read off). Be advised that if there is not a lot of >> > cardinality on 'a', you may need to disable certain index plans to get >> > a good plan in some cases. >> >> hm, one more point: I notice you are mixing ASC/DESC in the index >> definition. Try to avoid doing that: it will make index based paging >> of the table more difficult. If you have to, try transforming the >> values so that you can index all the fields ASC or DESC. This will >> also fit easier into row-wise comparisons strategy although it will >> slow down insertion a bit. >> >> merlin > > Thanks Merlin. I wasn't aware of the row-wise comparison stuff. It's tricky > in the case above though as I want to do (a > b) OR (a=b AND b < c). > > As it happens, the application needs to iterate the set of "next" > relationships, which as I had it meant establishing a new set on each > iteration (where new set = old set minus its first entry). This is what the > filter was doing in the above query. This is too expensive (as ~7000 > iterations) and so have modified solution to cache the ordered set when > first established. > > Even without the filter I still need to order the return as above. Still > don't understand why it doesn't use the index for this. postgres isn't smart enough to convert the complex boolean expressions for multiple field set ordering into a multi-column index lookup unless the row-wise comparison syntax is used. where (a,b) > (a1,b1) can be written as boolean: where (a>a1) OR (a=a1 AND b>b1) alternate form is: where (a>=a1) AND (a>a1 OR b>b1) where (a,b,c) > (a1,b1,c1) can be written as boolean: where (a>a1) OR (a=a1 AND b>b1) OR (a = b1 AND b1 AND c>c1) alternate form is: where (a>=a1) AND (a>a1 OR b>=b1) AND (a>a1 OR b>b1 OR c>=c1) in either case using boolean construction postgres will only use an index on 'a' if it exists, but will fully optimize the row-wise case if it matches the index. therefore, making your query faster using row-wise is going to be an exercise of making an index matching your set browsing that contains either all ASC or all DESC -- you can't mix and match. in order to do that some transformation of your values is in order -- either directly on the table itself or with functional indexes doing the transformation during index build. For integers we can do that by multiplying by -1. other datatypes might have more complicated constructions but it's usually doable. all this is assuming btw that you can logically order your table with a consistent ordering and you are trying to index lookups between some two points withing that ordering (perhaps sliding that windows as you crawl the table) merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance