Eduardo Morras wrote:
>> >
>> >Hi,
>> >
>> >if I understand this right, it does not mean "check if the string
>> >appears at position 0", which could translate into an index query,
>> >but rather "check if the string appears anywhere and then check
>> >whether that is position 0", so the entire table is checked.
>>
>> The second one, yes: since it checks all the patterns you want only
>> once per row, it needs only one table scan. The first one eliminates
>> the substring 'ABC' from the string; if the lengths of both strings
>> are equal, the substring 'ABC' wasn't in it. If they are different,
>> the trimmed string will be shorter.
>>
>> >explain analyze select items.num, wantcode from items, n
>> >  where strpos(code, wantcode) = 0;
>> > Nested Loop  (cost=167.14..196066.54 rows=39178 width=36) (actual time=0.074..36639.312 rows=7832539 loops=1)
>> >   Join Filter: (strpos(("inner".code)::text, "outer".wantcode) = 0)
>> >   ->  Seq Scan on n  (cost=0.00..14.15 rows=815 width=32) (actual time=0.005..2.212 rows=815 loops=1)
>> >   ->  Materialize  (cost=167.14..263.28 rows=9614 width=42) (actual time=0.007..13.970 rows=9614 loops=815)
>> >         ->  Seq Scan on items  (cost=0.00..167.14 rows=9614 width=42) (actual time=0.044..14.855 rows=9614 loops=1)
>> > Total runtime: 46229.836 ms
>> >
>> >The query ran much faster than the pattern query, however. This
>> >seems to be the performance of just searching for a plain string
>> >vs. initializing the regex engine every time (for 815 queries in a
>> >test set).
>>
>> It will do only one table scan, while your original code will do one
>> for each substring you want to test. You can add more and more
>> substrings without much extra cost. If you want to use the regex
>> engine instead of the PostgreSQL string functions, check
>> regexp_matches(); it should be faster if you have 3000 substrings.
>>
>> select * from items
>>   where regexp_matches(items.code, '(ABC)|(DE1)|(any_substring)') <> '{}';
>>

Hi Eduardo,

it is clear that scanning the table once with a list of matches will
outperform rescanning the table for every string wanted. Now, my
problem is that the patterns are dynamic as well. So if I could
translate a table with one column and a few thousand rows into
something like

  regexp_matches(code, 'string1|string2|.....string2781')

that would ideally be a performant query. Unfortunately I have no idea
how I could achieve this transformation inside the database (a couple
of sketches follow below). Doing it externally fails, because any
single query cannot be longer than so few characters.

Regards
Wolfgang Hamann
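
A minimal sketch of the replace()/length() test Eduardo describes
above, assuming the items(num, code) table from the plan; 'ABC' stands
in for one substring to check:

  -- if removing 'ABC' shortens code, then code contained 'ABC';
  -- equal lengths mean the substring never occurred
  select num, code
    from items
   where length(code) <> length(replace(code, 'ABC', ''));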
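
One detail worth noting about the strpos() join in the plan above:
strpos() is 1-based and returns 0 when the substring is absent, so the
filter strpos(code, wantcode) = 0 keeps the rows where wantcode does
not occur at all (hence the 7832539 result rows, nearly the full
815 x 9614 cross product). A prefix test would compare against 1:

  -- prefix match: position 1 means code starts with wantcode
  select items.num, n.wantcode
    from items
    join n on strpos(items.code, n.wantcode) = 1;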
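
As for building the alternation inside the database: one way, sketched
here under the assumption that the patterns live in the n(wantcode)
table from the plan above, is to aggregate them into a single regex
and match it in one scan. string_agg() needs PostgreSQL 9.0 or later;
on older releases array_to_string(array_agg(wantcode), '|') does the
same job. Note that the patterns are treated as regular expressions,
so any metacharacters in wantcode would need escaping:

  -- build 'string1|string2|...|stringN' once,
  -- then scan items a single time
  select items.num, items.code
    from items,
         (select string_agg(wantcode, '|') as pat from n) p
   where items.code ~ p.pat;

If only matches at the start of code are wanted, anchor the pattern,
e.g. select '^(' || string_agg(wantcode, '|') || ')' as pat from n.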