Re: Regarding bytea column in PostgreSQL

On 10 April 2015 at 03:27, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
 
one possible rationale for using BYTEA is that the data could be in various encodings, which the application wishes to preserve, and keeps track of somewhere else (perhaps in a field within the XML?).

Thanks for bringing this up, as it's a good reason to use bytea for XML.

XML actually has an encoding field in the XML declaration at the top of the document, e.g.

    <?xml version="1.0" encoding="UTF-8"?>

It is common - and of dubious correctness - for applications to store XML in a 'text' or 'xml' field without changing the 'encoding' field in the XML declaration to reflect the encoding at rest.
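To make the mismatch concrete, here is a rough Python sketch of a check an application might run before storing a document. The helper names (declared_encoding, matches_declaration) are made up for illustration, and it assumes an ASCII-compatible encoding so the declaration itself can be read from the raw bytes; UTF-16 input would need BOM handling that is not shown.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document. Assumes an ASCII-compatible byte encoding.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(raw: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    m = _DECL.match(raw)
    return m.group(1).decode("ascii") if m else None


def matches_declaration(raw: bytes) -> bool:
    """True if the bytes decode cleanly in the encoding they claim to use."""
    # With no declaration (and no BOM), XML defaults to UTF-8.
    enc = declared_encoding(raw) or "utf-8"
    try:
        raw.decode(enc)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers encoding names Python does not know.
        return False
    return True


# Latin-1 bytes that still claim to be UTF-8: the declaration is stale.
stale = b'<?xml version="1.0" encoding="UTF-8"?><a>\xe9</a>'
honest = b'<?xml version="1.0" encoding="ISO-8859-1"?><a>\xe9</a>'
```

Note this can only catch cases where the bytes are invalid in the declared encoding; a document that happens to decode without error under the wrong encoding passes silently.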

Personally I wish the 'xml' type in Pg knew how to change the encoding declaration dynamically, but I know it's a hairy problem. For example, if the client_encoding is iso-8859-1 but the client then converts the XML document to utf-8 internally, the declaration will be wrong unless the client changes it back.
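On the client side, the rewrite I'm wishing for looks roughly like the sketch below. This is not something Pg does; transcode_xml is a hypothetical helper, and the caller has to know the real source encoding.

```python
import re

# Captures everything up to and including the opening quote of the encoding
# value, then the closing quote, so only the value itself is replaced.
_ENC = re.compile(r'^(<\?xml[^>]*?encoding\s*=\s*["\'])[^"\']+(["\'])')


def transcode_xml(raw: bytes, source: str, target: str) -> bytes:
    """Re-encode an XML document and keep its declaration in step."""
    text = raw.decode(source)
    # Rewrite the declared encoding, if a declaration is present.
    text = _ENC.sub(lambda m: m.group(1) + target + m.group(2), text, count=1)
    return text.encode(target)


latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><a>\xe9</a>'
utf8_doc = transcode_xml(latin1_doc, "iso-8859-1", "UTF-8")
```

If the document has no declaration, nothing is rewritten and the bytes are simply re-encoded, which is only safe when the target is UTF-8.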

I've also run into XML documents that shove data in different encodings into CDATA sections. This is wrong, of course, but apps sometimes do it anyway.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
