Search Postgresql Archives

Re: Can we make regexp processing more friendly by recognizing "\r\n" as a "newline" for "^$" purposes?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi David:

On Sun, Oct 18, 2015 at 7:49 PM, David G. Johnston
<david.g.johnston@xxxxxxxxx> wrote:
> Other implementation of regular expressions handle "newline" mechanics
> related to "^" and "$" semantically instead of literally.  By that I mean
> that both "\r\n" and "\n" are considered "newlines" instead of just "\n".

Which ones ? AFAIK this kind of thing is usually done by C ( and
related ) runtimes when reading text files.

At least in my machine perl does not do it:

censored:~$ perl -e 'print( ("A\r\n" =~ /A$/) ? "matched\n" : "NO MATCH\n");'
NO MATCH
censored:~$ perl -e 'print( ("A\r\n" =~ /A.$/) ? "matched\n" : "NO MATCH\n");'
matched
censored:~$ perl -e 'print( ("A\r\n" =~ /A\s$/) ? "matched\n" : "NO MATCH\n");'
matched

Normally when reading lines in CP/M and related ( MSDOS, Windows ) the
CRT does collapse them ( and sometimes just zaps \r, or collapse any
run, or consider [\r*]\n[\r*] or.... ). But I normally do not see that
behaviour in regexes.

> If changing behavior is not desirable I would be content with another flag
> that would toggle such behavior.
> In code - both of these subqueries should match whereas presently only the
> first one does.
> SELECT regexp_matches(E'123\n',   E'123$', 'w');
> SELECT regexp_matches(E'123\r\n', E'123$', 'w');
> I don't know if this is server O/S dependent...but I would not expect it to
> be so.

Neither do I ( expect it to be os dep. ) , but I find the current
behaviour correct. I mean, newline stuff is OS dependent, and you
should convert when ingesting data, when matching them it should
already have been converted to whatever the language uses for newlines
( in C and perl that means \n, which needs not be \012, BTW . In unix
\n=\012 on disk, on CP/M it's \015\012 and when I worked with Mac (
before the unixy osX they use now ) it was \015, and I cannot think on
what they can use on EBCDIC machines ).

Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux