Re: [HACKERS] tsearch2 in postgresql 8.3.1 - invalid byte sequence for encoding "UTF8": 0xc3

Martijn van Oosterhout <kleptog@xxxxxxxxx> writes:
> On Wed, Mar 19, 2008 at 07:55:40PM -0400, Tom Lane wrote:
>> (that's \303\240 or 0xc3 0xa0).  I am thinking that something decided
>> the \240 was junk and removed it.

> Hmm, it is coincidentally the space character +0x80, which is defined as
> a non-breaking space in many Latin encodings.

Yeah, that's what I'm thinking about.  I poked around in Microsoft's
documentation and couldn't find any suggestion that fgets() would
remove such a character, however.

Another possible theory is that the french.stop file got edited using
something that had the wrong idea about the file's encoding, and
proceeded to throw away the nbsp.
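
To make the byte arithmetic concrete, here is a minimal Python sketch of the
failure mode being hypothesized. It is not taken from the PostgreSQL sources or
from any editor; strip_nbsp_bytes and is_valid_utf8 are hypothetical helpers
standing in for "something that treats 0xa0 as a Latin-1 nbsp and drops it".

```python
# UTF-8 encodes "à" (U+00E0) as the two bytes 0xc3 0xa0 (octal \303\240).
a_grave = "à".encode("utf-8")
assert a_grave == b"\xc3\xa0"

# Read as Latin-1, the byte 0xa0 on its own is the non-breaking space.
nbsp = b"\xa0".decode("latin-1")
assert nbsp == "\u00a0"


def strip_nbsp_bytes(data: bytes) -> bytes:
    """Hypothetical byte-level filter that discards every 0xa0 byte,
    as a tool assuming a Latin encoding might do."""
    return data.replace(b"\xa0", b"")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Dropping the continuation byte leaves the lone lead byte 0xc3,
# which is not a complete UTF-8 sequence.
damaged = strip_nbsp_bytes(a_grave)
```

Under these assumptions, damaged is the single byte 0xc3, which is the value
named in the reported "invalid byte sequence" error.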

			regards, tom lane
