Search Postgresql Archives

Re: split string by special characters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Current design of function
---text_to_array( sen text, col collection )---
DECLARE
  delimiter_pos int[] := '{}';
  return_val text[] := '{}';
  delimiters_chr RECORD;
  remaining_text text;
  delimiter_curr text;
  i int := 0;
  j int := 0;
  prev_pos int := 1;
  curr_pos int := 0;
  delimiter_len int;
BEGIN
  FOR delimiters_chr IN SELECT upper( chars.chars ) AS chars FROM
chars, word WHERE chars.chars_id = word.chars_id AND
word.classification_id = clss.classification_id LOOP
    delimiter_len := character_length( delimiters_chr.chars ); --
Length of the current delimiter we're looking at
    IF prev_pos > 0 THEN -- Don't add text if it hasn't been altered
(previous delimiter not found)
        remaining_text := upper( sen ); -- Temporary copy of text to
search through
    END IF;
    prev_pos := 0;
    curr_pos := position( delimiters_chr.chars IN sen ); -- Delimiter
position in text (sentence)

    WHILE curr_pos > 0 LOOP -- Found delimiter in text
      delimiter_pos := array_append( delimiter_pos, curr_pos +
prev_pos ); -- Add position of delimiter
      delimiter_pos := array_append( delimiter_pos, curr_pos +
prev_pos + delimiter_len ); -- Add position where delimiter ends
      remaining_text := substring( remaining_text FROM curr_pos +
delimiter_len ); -- Look only at text still
      prev_pos := curr_pos + prev_pos; -- Need to keep track of how
much text has been stripped off
      curr_pos := position( delimiters_chr.chars IN remaining_text );
-- Calculate the next occurance
    END LOOP;
  END LOOP;
  remaining_text := sen; -- Temporary copy of text to extract text
from
  SELECT DISTINCT * INTO delimiter_pos FROM array_append( array_prepend
( 1, sort( delimiter_pos ) ), character_length( sen ) + 1 ); -- Add
first character pos and last to the sorted array.
  curr_pos := array_upper( delimiter_pos, 1 ); -- Calculate the size
of the array
  WHILE j < curr_pos LOOP -- Look through the array
      j := i + 1;
      return_val := array_append( return_val, substring
( remaining_text FROM delimiter_pos[i] FOR (delimiter_pos[j] -
delimiter_pos[i]) ) ); -- Add the parts to a new array with text
      i := i + 1;
  END LOOP;
  RETURN return_val; -- Return the result
END;
----
Need an additional function as well, to sort the array.
The delimiters has to be placed in another table.

Please note that it can't handle variations to a delimiter such as
<table style="..."> due to the present design as I couldn't use
regexp. <table> (case doesn't matter) without any other extras, work
fine though. I initially hoped it would be possible to allow matchings
such as <td(.*)> as well as pairs <td>(.*)</td> and variations
thereof, such as <td(.*)>(.*)</td>, but that's something I need a bit
of help with (hint, hint).

Written to run on eComStation v2.0 (successor of OS/2)
//Jan-Erik

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux