Re: large document multiple regex

On Jan 26, 2007, at 9:06 AM, Merlin Moncure wrote:
I am receiving a large (300k+) document from an external agent and
need to extract a few interesting bits of data from the document in
an insert trigger into separate fields.

Regex seems one way to handle this, but is there any way to avoid
rescanning the document for each regex?  One solution I am kicking
around is some C hackery, but then I lose the expressive power of
regex.  Ideally, I need to be able to scan some text and return a
comma-delimited string of values extracted from it.  Does anybody know
if this is possible or have any other suggestions?

Have you thought about something like ~ '(first_string|second_string|third_string)'? Obviously your real pattern would be more complex, but I believe that with careful crafting you can get a single regex to do a lot without resorting to multiple passes.
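
As a rough sketch of the idea (in Python rather than PostgreSQL, and with made-up field markers, since the actual document format isn't given), one alternation with a named group per field lets a single scan pick up every value. The markers order_id=, customer= and total=, and the function name extract_fields, are assumptions for illustration only:

```python
import re

# One pattern, one alternative per field. The markers are hypothetical;
# substitute whatever actually delimits the values in the real document.
FIELDS = re.compile(
    r"order_id=(?P<order_id>\d+)"
    r"|customer=(?P<customer>\w+)"
    r"|total=(?P<total>\d+\.\d{2})"
)

# Fixed output order for the comma-delimited result.
FIELD_ORDER = ("order_id", "customer", "total")


def extract_fields(document: str) -> str:
    """Scan the document once and return the captured values, comma-delimited."""
    found = {}
    for match in FIELDS.finditer(document):
        name = match.lastgroup  # the named group of the alternative that matched
        found.setdefault(name, match.group(name))  # keep the first occurrence only
    # Fields that never appeared come back as empty strings.
    return ",".join(found.get(name, "") for name in FIELD_ORDER)
```

Calling extract_fields(doc) walks the text once and returns the three values in a fixed order. The same alternation idea should carry over to whatever regex support you have inside the trigger, though how you get at the individual captures there will differ.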
--
Jim Nasby                                            jim@xxxxxxxxx
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



