Re: Strange (?) Index behavior?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ok, you thought maybe this thread died or got abandoned in the face of
all the senseless trolling and spam going on.. you were wrong.. ;)

I thought though I'd start over trying to explain what's going on. 
I've gone through some dumps, and recreation of the database with some
different filesystem options and whatnot, and starting over fresh
here's the situation.

First, the structure.

CREATE TABLE testtable (
  nid serial UNIQUE NOT NULL,
  sname text NOT NULL,
  iother int4
);

CREATE UNIQUE INDEX idx_sname_unique ON testtable (sname);

-----

With the above, the query "SELECT sname FROM testtable WHERE sname
LIKE 'A%';" DOES use an index scan on idx_sname_unique -- sometimes. 
Other times, the planner thinks a sequential scan would be better.

The index is large.  There are over 70 million rows in this table. 
The estimated cost and so forth from EXPLAIN on the above query is way
off as well, but I expect that to be the case considering the size of
the table -- perhaps there is a tunable in the statistics gathering
backend ot fix this?

My goal was to obviously make queries of the above type, as well as
more refined ones such as "... LIKE 'AB%';" faster.

This goal in mind, I thought that creating several indexes (36 of
them) would speed things up -- one index per alphanumeric start
character, via..

CREATE INDEX idx_sname_suba ON testtable (sname) WHERE sname LIKE 'A%';
CREATE INDEX idx_sname_subb ON testtable (sname) WHERE sname LIKE 'B%';
...
CREATE INDEX idx_sname_subz ON testtable (sname) WHERE sname LIKE 'Z%';

(also including 0..9)

I've wracked my brain trying to come up with other ways of doing this,
including partitioning the table, and trying the suggestions here such
as "substr(1,1)" in the index creation instead of creating many
distinct indexes.

None of these seems to speed up the queries enough to make them
"acceptable" when it comes to runtimes.  My data from before was
somehow in error.. not sure why.  At this point, using one index vs.
the other the runtimes are about the same.

search=# explain analyze
search-# SELECT sname FROM
search-# (SELECT sname FROM testtable WHERE sname LIKE 'A%') AS subq
search-# WHERE sname LIKE 'AA%';
                                                               QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using sname_a on "testtable"  (cost=0.00..189.41 rows=47
width=20) (actual time=16.219..547053.251 rows=74612 loops=1)
   Index Cond: ((sname >= 'A'::text) AND (sname < 'B'::text) AND
(sname >= 'AA'::text) AND (sname < 'AB'::text))
   Filter: ((sname ~~ 'A%'::text) AND (sname ~~ 'AA%'::text))
 Total runtime: 547454.939 ms
(4 rows)

Time: 547458.216 ms


search=# explain analyze
search-# SELECT sname FROM testtable WHERE sname LIKE 'AA%';
                                                                 
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using sname_unique on "testtable"  (cost=0.00..34453.74
rows=8620 width=20) (actual time=77.004..537065.079 rows=74612
loops=1)
   Index Cond: ((sname >= 'AA'::text) AND (sname < 'AB'::text))
   Filter: (sname ~~ 'AA%'::text)
 Total runtime: 537477.737 ms
(4 rows)

Time: 537480.571 ms


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux