Re: Help with High value unicode characters

On Tue, Aug 07, 2007 at 05:09:35PM -0400, Chris Hoover wrote:
> We need some help, we have some what we believe are high value unicode
> characters (Unicode 0x2).

What do you mean by "high value unicode characters (Unicode 0x2)"?
Characters with code points in a plane other than Plane 0 (BMP,
Basic Multilingual Plane), i.e., with a code point greater than
U+FFFF?

> How can you search and replace for these?  We are storing this data
> in a text field, and having the data contain this unicode value is
> violating our xml rules the application uses and causing abends in
> our application.

If I understand what you're asking, you should be able to use
regexp_replace (available in PostgreSQL 8.1 and later) to fix the data.  Example:

UPDATE tablename
   SET columnname = regexp_replace(columnname, E'[\\U00010000-\\U0010FFFF]+', '', 'g')
 WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';
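
Before changing anything, it may be worth checking how many rows the
pattern matches and trying the UPDATE inside a transaction first.  A
minimal sketch, assuming the same placeholder names (tablename,
columnname) as above and a UTF8-encoded database; adjust to your schema:

```sql
-- Count the rows that contain at least one character above U+FFFF.
-- tablename and columnname are placeholders, not real object names.
SELECT count(*)
  FROM tablename
 WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';

-- Trial run: apply the change, confirm nothing still matches, then undo it.
BEGIN;

UPDATE tablename
   SET columnname = regexp_replace(columnname, E'[\\U00010000-\\U0010FFFF]+', '', 'g')
 WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';

-- Should now report 0 if the replacement did what was intended.
SELECT count(*)
  FROM tablename
 WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';

ROLLBACK;  -- switch to COMMIT once the result looks right
```

If the first count is zero, the offending characters are probably not
outside the BMP, and the character class would need a different range.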

If that doesn't help then please clarify the problem.

-- 
Michael Fuhr
