On Jan 26, 2007, at 9:06 AM, Merlin Moncure wrote:
I am receiving a large (300k+) document from an external agent and need to extract a few interesting bits of data from it into separate fields on an insert trigger. Regex seems one way to handle this, but is there any way to avoid rescanning the document for each regex? One solution I am kicking around is some C hackery, but then I lose the expressive power of regex. Ideally, I need to be able to scan some text and return a comma-delimited string of values extracted from it. Does anybody know if this is possible, or have any other suggestions?
Have you thought about something like ~ '(first_string|second_string|third_string)'? Obviously your real pattern would be more complex, but I believe that with careful crafting, you can get a single regex to do a lot without resorting to multiple passes.
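To make the alternation idea concrete, here is a minimal sketch in Python rather than in the database itself. The field markers (order_id=, customer=, total=) and the function names are made up for illustration, since the thread does not say what the document looks like, and Python's regex dialect is not identical to PostgreSQL's. The point is only that one combined pattern lets the text be scanned once, with each match reporting which alternative fired:

```python
import re

# Hypothetical field markers; the real document format is not given in the thread.
# Each alternative carries exactly one named group so a match identifies its field.
PATTERN = re.compile(
    r"order_id=(?P<order_id>\d+)"
    r"|customer=(?P<customer>\w+)"
    r"|total=(?P<total>\d+\.\d{2})"
)

def extract_fields(text):
    """Scan the text once and keep the first value seen for each field."""
    found = {}
    for m in PATTERN.finditer(text):
        name = m.lastgroup  # name of the alternative that matched
        found.setdefault(name, m.group(name))
    return found

def as_csv(text, order=("order_id", "customer", "total")):
    """Return the extracted values as a comma-delimited string, in a fixed order."""
    found = extract_fields(text)
    return ",".join(found.get(key, "") for key in order)
```

A field that never appears comes back as an empty slot, so the positions in the delimited string stay stable for whatever splits it afterwards.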
-- Jim Nasby jim@xxxxxxxxx EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)