> On 01 May 2015, at 13:54, David Osborne <david@xxxxxxxxxxx> wrote:
>
> Hi,
>
> We have a query which finds the latest row_id for a particular code.
>
> We've found a backwards index scan is much slower than a forward one, to the
> extent that disabling indexscan altogether actually improves the query time.
>
> Can anyone suggest why this might be, and what's best to do to improve the
> query time?
>
> dev=> \d table
>        Table "public.table"
>     Column    |              Type              | Modifiers
>  -------------+--------------------------------+-----------
>   row_id      | integer                        |
>   code        | character(2)                   |
>  Indexes:
>      "table_code_idx" btree (code)
>      "table_row_idx" btree (row_id)
>
> dev=> select count(*) from table;
>   count
> ---------
>  6090254
> (1 row)
>
> dev=> select count(distinct(row_id)) from table;
>   count
> ---------
>  5421022
> (1 row)
>
> dev=> select n_distinct from pg_stats where tablename='table' and attname='row_id';
>  n_distinct
> ------------
>   -0.762951
> (1 row)
>
> dev=> show work_mem;
>  work_mem
> -----------
>  1249105kB
> (1 row)
>
> dev=> select version();
>                                                     version
> --------------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2), 64-bit
> (1 row)
>
> The query in question:
>
> dev=> explain (analyse,buffers) select row_id as last_row_id from table where code='XX' order by row_id desc limit 1;
>                                                                    QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..1.67 rows=1 width=4) (actual time=835.281..835.282 rows=1 loops=1)
>    Buffers: shared hit=187961
>    ->  Index Scan Backward using table_row_idx on table  (cost=0.43..343741.98 rows=278731 width=4) (actual time=835.278..835.278 rows=1 loops=1)
>          Filter: (code = 'XX'::bpchar)
>          Rows Removed by Filter: 4050971
>          Buffers: shared hit=187961
>  Total runtime: 835.315 ms
> (7 rows)
>
> http://explain.depesz.com/s/uGC
>
> So we can see it's doing a backwards index scan.
> Out of curiosity I tried a forward scan and it was MUCH quicker:
>
> dev=> explain (analyse,buffers) select row_id as first_row_id from table where code='XX' order by row_id asc limit 1;
>                                                                 QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..1.67 rows=1 width=4) (actual time=19.473..19.474 rows=1 loops=1)
>    Buffers: shared hit=26730
>    ->  Index Scan using table_row_idx on table  (cost=0.43..343741.98 rows=278731 width=4) (actual time=19.470..19.470 rows=1 loops=1)
>          Filter: (code = 'XX'::bpchar)
>          Rows Removed by Filter: 62786
>          Buffers: shared hit=26730
>  Total runtime: 19.509 ms
> (7 rows)
>
> http://explain.depesz.com/s/ASxD
>
> I thought adding an index on row_id desc might be the answer, but it has little effect:
>
> dev=> create index row_id_desc_idx on table(row_id desc);
> CREATE INDEX
> Time: 5293.812 ms
>
> dev=> explain (analyse,buffers) select row_id as last_row_id from table where code='XX' order by row_id desc limit 1;
>                                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..1.66 rows=1 width=4) (actual time=944.666..944.667 rows=1 loops=1)
>    Buffers: shared hit=176711 read=11071
>    ->  Index Scan using row_id_desc_idx on table  (cost=0.43..342101.98 rows=278731 width=4) (actual time=944.663..944.663 rows=1 loops=1)
>          Filter: (code = 'XX'::bpchar)
>          Rows Removed by Filter: 4050971
>          Buffers: shared hit=176711 read=11071
>  Total runtime: 944.699 ms
> (7 rows)
>
> http://explain.depesz.com/s/JStM
>
> In fact, disabling the index scan completely improves matters considerably:
>
> dev=> drop index row_id_desc_idx;
> DROP INDEX
> dev=> set enable_indexscan to off;
> SET
>
> dev=> explain (analyse,buffers) select row_id as last_row_id from table where code='XX' order by row_id desc limit 1;
>                                                                QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=74006.39..74006.39 rows=1 width=4) (actual time=183.997..183.998 rows=1 loops=1)
>    Buffers: shared hit=14723
>    ->  Sort  (cost=74006.39..74703.22 rows=278731 width=4) (actual time=183.995..183.995 rows=1 loops=1)
>          Sort Key: row_id
>          Sort Method: top-N heapsort  Memory: 25kB
>          Buffers: shared hit=14723
>          ->  Bitmap Heap Scan on table  (cost=5276.60..72612.74 rows=278731 width=4) (actual time=25.533..119.320 rows=275909 loops=1)
>                Recheck Cond: (code = 'XX'::bpchar)
>                Buffers: shared hit=14723
>                ->  Bitmap Index Scan on table_code_idx  (cost=0.00..5206.91 rows=278731 width=0) (actual time=23.298..23.298 rows=275909 loops=1)
>                      Index Cond: (code = 'XX'::bpchar)
>                      Buffers: shared hit=765
>  Total runtime: 184.043 ms
> (13 rows)
>
> http://explain.depesz.com/s/E9VE

Your queries are slow not because of the backward scan; they are slow because of

    Rows Removed by Filter: 4050971

Try creating an index on (code, row_id). A sketch is included below the quoted message.

> Thanks in advance for any help.
>
> Regards,
> --
> David Osborne
> Qcode Software Limited
> http://www.qcode.co.uk
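A minimal sketch of that suggestion, using the placeholder table and column names from the original post. The index name table_code_row_idx is hypothetical, and the expected plan shape is an assumption about typical planner behaviour, not output from this thread:

    -- Composite btree index: equality column first, ordering column second.
    -- "table_code_row_idx" is a hypothetical name; "table", "code" and "row_id"
    -- are the placeholder names used in the original post.
    create index table_code_row_idx on table (code, row_id);

    -- The original query, unchanged. With this index the planner can descend
    -- straight to the highest row_id within code = 'XX', so the expectation is
    -- something like an Index (Only) Scan Backward with
    -- Index Cond: (code = 'XX'::bpchar) and no "Rows Removed by Filter" line.
    select row_id as last_row_id
      from table
     where code = 'XX'
     order by row_id desc
     limit 1;

A separate descending index should not be needed here: a plain btree can be scanned in either direction, which is why the (row_id desc) index above performed no better than the ascending one. In every slow plan the cost was the filter discarding roughly four million non-matching rows, not the scan direction.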