I have updated the documentation to be more direct about COPY encoding
behavior.  Patch attached and applied.

---------------------------------------------------------------------------

Peter Headland wrote:
> > Maybe the link might help?
> >
> > http://www.postgresql.org/docs/8.4/interactive/multibyte.html
>
> That page is too generic; what would be helpful is a section in the doc for each command that is affected by I18N/L10N considerations, that identifies how that specific command behaves.
>
> Now that I have grasped the behavior, I'm more than happy to edit the COPY doc page, if people think that would be helpful/worthwhile.
>
> --
> Peter Headland
> Architect
> Actuate Corporation
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:aklaver@xxxxxxxxxxx]
> Sent: Thursday, September 10, 2009 11:06
> To: Peter Headland
> Cc: pgsql-general@xxxxxxxxxxxxxx; Tom Lane
> Subject: Re: COPY command character set
>
>
> ----- "Peter Headland" <pheadland@xxxxxxxxxxx> wrote:
>
> > > The COPY command reference page saith
> > >
> > > Input data is interpreted according to the current client encoding,
> > > and output data is encoded in the the current client encoding, even
> > > if the data does not pass through the client but is read from or
> > > written to a file.
> >
> > Rats - I read the manual page twice and that didn't register on my
> > feeble consciousness. I suspect that I didn't look beyond the word
> > "client", since I knew I wasn't interested in client behavior and I was
> > speed-reading. On the assumption that I am not uniquely stupid, maybe we
> > could re-phrase this slightly, with a "for example", and add a heading
> > "Localization"?
> >
> > As a general comment, I18N/L10N is a hairy enough topic that it merits
> > its own heading in any commands where it is an issue.
> >
> > How about my suggestion to add a means (extend COPY syntax) to specify
> > encoding explicitly and handle UTF lead bytes - would that be of
> > interest?
> >
> > --
> > Peter Headland
> > Architect
> > Actuate Corporation
> >
> >
> > > The COPY command reference page saith
> >
> > Input data is interpreted according to the current client encoding,
> > and output data is encoded in the the current client encoding, even
> > if the data does not pass through the client but is read from or
> > written to a file.
> >
> > Seems clear enough to me.
> >
> > regards, tom lane
>
> Maybe the link might help?
>
> http://www.postgresql.org/docs/8.4/interactive/multibyte.html
>
>
> Adrian Klaver
> aklaver@xxxxxxxxxxx
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

--
  Bruce Momjian  <bruce@xxxxxxxxxx>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/ref/copy.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.93
diff -c -c -r1.93 copy.sgml
*** doc/src/sgml/ref/copy.sgml	17 Feb 2010 04:19:39 -0000	1.93
--- doc/src/sgml/ref/copy.sgml	23 Feb 2010 05:15:00 -0000
***************
*** 367,376 ****
     </para>

     <para>
!     Input data is interpreted according to the current client encoding,
!     and output data is encoded in the the current client encoding, even
!     if the data does not pass through the client but is read from or
!     written to a file.
     </para>

     <para>
--- 367,376 ----
     </para>

     <para>
!     <command>COPY</command> always processes data according to the
!     current client encoding, even if the data does not pass through
!     the client but is read from or written to a file directly by the
!     server.
     </para>

     <para>
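
For anyone reading the archive later, here is a rough sketch of what the revised paragraph means in practice. It is plain Python and never talks to a server; it only shows why the session's client encoding matters for a file the server reads directly, and builds the statements one might issue around a COPY. The helper name, the table name "mytable", and the path "/tmp/data.txt" are made up for illustration; only SET/RESET client_encoding and COPY ... FROM are actual PostgreSQL commands.

```python
# Illustration only: plain Python, no PostgreSQL connection involved.
# The helper name, table name, and file path are hypothetical.

def copy_from_file_statements(table, path, file_encoding):
    """Return SQL that sets client_encoding to match the file before COPY.

    No identifier or literal quoting is done here, so this is only
    suitable for trusted, hand-written values.
    """
    return [
        f"SET client_encoding TO '{file_encoding}';",
        f"COPY {table} FROM '{path}';",
        "RESET client_encoding;",
    ]

# The same bytes on disk mean different things under different encodings.
raw = "café".encode("latin-1")      # the bytes a LATIN1 file would hold

as_latin1 = raw.decode("latin-1")   # the interpretation the file intends

try:
    raw.decode("utf-8")             # the wrong assumption about the file
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False                 # a lone 0xE9 byte is not valid UTF-8

statements = copy_from_file_statements("mytable", "/tmp/data.txt", "LATIN1")
for statement in statements:
    print(statement)
```

The point is only that the encoding has to be declared to match the file before the COPY runs, since the server will not guess it from the file itself.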