Search Postgresql Archives

Re: looking for a faster way to do that

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At 14:12 23/09/2011, hamann.w@xxxxxxxxxxx wrote:
Eduardo Morras wrote:

>> You can try these, i doubt they will use any index but its a
>> different approach:
>>
>> select * from items where length(items.code)<>length(rtrim(items.code,'ABC'));
>>
>> select * from items where strpos(items.code,'ABC')=0 or
>> strpos(items.code,'any_substring')=0;

Hi,

if I understand this right, it does not mean "check if the string appears at position 0" which could translate into an index query, but rather "check if the string appears anywhere
and then check if that is position 0", so the entire table is checked.

The second one yes, as it checks all patterns you want only one time per row they only needs one table scan. The first one eliminates the substring 'ABC' from the string, if the lengths of both strings are equal, the substring 'ABC' wasn't in it. If they are different, the trimmed string will be shorter.

explain analyze select items.num, wantcode from items, n where strpos(code, wantcode) = 0; Nested Loop (cost=167.14..196066.54 rows=39178 width=36) (actual time=0.074..36639.312 rows=7832539 loops=1)
   Join Filter: (strpos(("inner".code)::text, "outer".wantcode) = 0)
-> Seq Scan on n (cost=0.00..14.15 rows=815 width=32) (actual time=0.005..2.212 rows=815 loops=1) -> Materialize (cost=167.14..263.28 rows=9614 width=42) (actual time=0.007..13.970 rows=9614 loops=815) -> Seq Scan on items (cost=0.00..167.14 rows=9614 width=42) (actual time=0.044..14.855 rows=9614 loops=1)
 Total runtime: 46229.836 ms


The query ran much faster than the pattern query, however. This seems to be the performance of just searching for a plain string vs. initializing the regex engine every time (for 815
queries in a test set)

It will do only one table scan while your original code will do one for each substring you want to test. You can add more and more substrings without too much cost. If you want to use the regex engine instead the postgresql string funtions check the regexp_matches(), it should be faster if you have 3000 substrings.

select * from items where regexp_matches(items.code,'(ABC) (DE1) (any_substring)')<>{};

Regards
Wolfgang Hamann

HTH


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux