
Re: Matching uppercased russian words (\x0410-\x042F) in UTF8 database 8.4.13

Thanks for trying! I am using CentOS 6.3 with PostgreSQL 8.4.13.

So it seems to work better in 9.2.x?

Unfortunately, I'd like to stay with 8.4.x for now,
because other projects on the same host
use this PostgreSQL instance.
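
A minimal sketch of what could still be checked on 8.4.x. It assumes the difference between the two servers comes from standard_conforming_strings, which is off by default before 9.1, so that backslashes inside '...' are consumed by the string parser before the regex engine sees them. That is an assumption, not something confirmed in this thread:

```sql
-- Assumption: if this shows "off", a plain '...' literal treats \1 and \x04
-- as string escapes, so the regex engine never receives the backslash.
SHOW standard_conforming_strings;

-- Doubling the backslashes in an E'' string passes \1 through to the
-- regex engine as a back-reference, whatever the setting above is.
SELECT 'ОШИБББКА' ~ E'(.)\\1\\1';

-- Writing the range endpoints literally (А is U+0410, Я is U+042F)
-- avoids backslash escapes altogether.
SELECT 'ПРОВЕРКА' ~ '^[А-Я]{2,}$';
```

If these behave as hoped, the same literals could be used inside the $body$ of keep_clean() without upgrading the server.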

Regards
Alex


On Wed, Mar 20, 2013 at 10:35 AM, Albe Laurenz <laurenz.albe@xxxxxxxxxx> wrote:
> Alexander Farber wrote:
>> I have prepared an SQL fiddle for my question:
>> http://sqlfiddle.com/#!11/8a494/4
>>
> Strange, it works here (RHEL 6, x86_64, PostgreSQL 9.2.2,
> encoding "UTF8", collation and ctype "de_DE.UTF8"):
>
> test=> SELECT 'ПРОВЕРКА' ~ '^[\u0410-\u042F]{2,}$';
>  ?column?
> ----------
>  t
> (1 row)
>
> test=> SELECT 'ABCDE' ~ '^[\u0410-\u042F]{2,}$';
>  ?column?
> ----------
>  f
> (1 row)
>
>> create table good_words (
>>         word varchar(64) primary key
>> );
>>
>> create or replace function keep_clean() returns trigger as $body$
>>         begin
>>                 new.word := upper(new.word);
>>
>>                 /* next line does not compile? */
>>                 IF new.word !~ '^[\x0410-\x042F]{2,}$' THEN
>>                     RAISE EXCEPTION 'Not an uppercased Russian word in UTF8';
>>                 END IF;
>>
>>                 IF new.word ~ '^[ЪЫЬ]' OR new.word ~ 'Ъ$' THEN
>>                     return NULL;
>>                 END IF;
>>
>>                 /* does not return NULL for 'ошибббка'? */
>>                 IF new.word ~ '(.)\1\1' AND new.word NOT LIKE '%ШЕЕЕ%'
>>                    AND new.word NOT LIKE '%ЗМЕЕЕ%' THEN
>>                     return NULL;
>
> This works for me as well:
>
> test=> SELECT 'ошибббка' ~ '(.)\1\1'
>           AND 'ошибббка' NOT LIKE '%ШЕЕЕ%'
>           AND 'ошибббка' NOT LIKE '%ЗМЕЕЕ%';
>  ?column?
> ----------
>  t
> (1 row)
>
> test=> SELECT 'ошиббка' ~ '(.)\1\1'
>           AND 'ошиббка' NOT LIKE '%ШЕЕЕ%'
>           AND 'ошиббка' NOT LIKE '%ЗМЕЕЕ%';
>  ?column?
> ----------
>  f
> (1 row)
>
>>                 END IF;
>>
>>                 return new;
>>         end;
>> $body$ language plpgsql;
>
> What do you get for
>
> SELECT pg_encoding_to_char(encoding),
>        datcollate,
>        datctype
> FROM pg_database WHERE datname = current_database();
>
> and for
>
> SHOW client_encoding;
>
> Yours,
> Laurenz Albe
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

