Search Postgresql Archives

Re: looking for a faster way to do that

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At 18:18 26/09/2011, you wrote:

Eduardo Morras <nec556@xxxxxxxxxx> wrote:

>>
>> At 08:04 25/09/2011, hamann.w@xxxxxxxxxxx wrote:
>>
>> > >> select * from items where regexp_matches(items.code,'(ABC) (DE1)
>> > >> (any_substring)')<>{};
>> > >>
>> >
>> >Hi Eduardo,
>> >
>> >it is clear that scanning the table once with a list of matches will
>> >outperform
>> >rescanning the table for every string wanted. Now, my problem is
>> >that the patterns are
>> >dynamic as well. So if I could translate a table with one
>> >column  and a few thousand rows
>> >into something like
>> >regexp_matches(code,'string1|string2|.....string2781')
>> >would ideally be a performant query. Unfortunately I have no idea
>> >how I could achieve this
>> >transformation inside the database. Doing it externally fails,
>> >because any single query cannot
>> >be more than so few characters.
>>
>> You can create a plsql function and pass a setof text that do it.
>> Sorry but instead saying you What Must You Type, WMYT(c), i prefer
>> the How Should You Do way, HSYD(c). Note that you can get the same
>> results using other approachs (f.ex. using FTS described in chapter 12)
>>
>> Check this topics:
>>
>> Function
>> Creation http://www.postgresql.org/docs/9.0/static/sql-createfunction.html
>>
>> Tutorial about Function
>> Creation  http://www.adderpit.com/practical-postgresql/x10374.htm
>>

Hi,

I tried the pl/sql approach to convert the contents of that patterns table into a regex.
Results: 40 seconds runtime for 9500 candidates and 815 patterns
718 seconds for the same set of 9500 candidates, but using 4000 patterns instead.
So it seems that I am reaching limits of pattern match

Perhaps calling the function twice with half the values go faster. How do you call the function? EXECUTE or SELECT? If you use EXECUTE then the prepared plan in a previous call is ignored and is usually faster. Don't know if in your case it run faster but you can try it.

As for the fulltext index (and the underlying tsquery): this is an exact match rather than prefix match, so I would need to know match patterns in advance in order to build the index

I am thinking about that anyway (because ABC1234 likely should not match ABC123 pattern in my context), but I would sort of prefer a system where I can state the rules when I
see the data set, rather than having to pre-create an index.

Thanks for the tutorial link :)

It's the one i used time ago. A bit old but very good one.

It seems that the responses on my post give all sorts of input that will help me on other
tasks

Regards
Wolfgang Hamann



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux