On Thursday 26 June 2008 15:41, Michael Fuhr wrote:
> On Thu, Jun 26, 2008 at 03:31:01PM +0200, Albe Laurenz wrote:
> > Michael Fuhr wrote:
> > > Your input data seems to have a mix of encodings: sometimes you're
> > > getting pound signs in a non-UTF-8 encoding, but if characters like
> > > <U+2019 RIGHT SINGLE QUOTATION MARK> got into the database when
> > > client_encoding was set to UTF8 then at least some data must have
> > > been in UTF-8.
> >
> > Sorry, but that's not true.
> > That character is 0x9s in WINDOWS-1252.
>
> I think you mean 0x92.
>
> > So it could have been that client_encoding was (correctly) set to WIN1252
> > and the quotation mark was entered as a single byte character.
>
> Yes, *if* client_encoding was set to win1252.  However, in the
> following thread Garry said that he was getting encoding errors
> when entering the pound sign that were resolved by changing
> client_encoding (I suggested latin1, latin9, or win1252; he doesn't
> say which he used):
>
> http://archives.postgresql.org/pgsql-general/2008-06/msg00526.php
>
> If client_encoding had been set to win1252 then Garry wouldn't have
> gotten encoding errors when entering the pound sign because that
> character is 0xa3 in win1252 (also in latin1 and latin9).  So either
> applications are setting client_encoding to different values,
> sometimes correctly and sometimes incorrectly (Garry, do you know
> if that could be happening?), or the data is sometimes in different
> encodings.  If the data is being entered via a web application

This is the case and so I need some way to tell the browser to send the
correct encoding - still researching.

regards
Garry

> then
> the latter seems more likely, at least in my experience (I've had
> to deal with exactly this problem recently).
>
> --
> Michael Fuhr
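
For anyone following the byte values quoted in this thread, here is a minimal sketch in Python that checks them. It is not part of the original exchange; the helper names (`encode_or_none`, `is_valid_utf8`) are made up for illustration, and the mapping from PostgreSQL encoding names to Python codec names is an assumption about equivalent character sets. It shows how the pound sign and U+2019 are encoded in each of the encodings mentioned, and that the single bytes 0xa3 and 0x92 are not valid UTF-8 on their own, which is why a client sending them while client_encoding is UTF8 would see encoding errors.

```python
# Code points discussed in the thread.
POUND = "\u00a3"  # POUND SIGN
RSQUO = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# Assumed correspondence between PostgreSQL encoding names and Python codecs.
CODECS = {
    "WIN1252": "cp1252",
    "LATIN1": "latin-1",
    "LATIN9": "iso8859_15",
    "UTF8": "utf-8",
}


def encode_or_none(char, codec):
    """Return the bytes for char in codec, or None if the codec lacks it."""
    try:
        return char.encode(codec)
    except UnicodeEncodeError:
        return None


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for pg_name, codec in CODECS.items():
        pound = encode_or_none(POUND, codec)
        rsquo = encode_or_none(RSQUO, codec)
        print(f"{pg_name:8} pound={pound!r} rsquo={rsquo!r}")

    # The single-byte forms are rejected when interpreted as UTF-8.
    print("0xa3 valid UTF-8?", is_valid_utf8(b"\xa3"))
    print("0x92 valid UTF-8?", is_valid_utf8(b"\x92"))
```

In this sketch a `None` result means the character does not exist in that encoding at all, which is the case for U+2019 in latin1 and latin9; among the single-byte encodings suggested above, only win1252 can represent it.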