On Wed, Mar 21, 2007 at 09:13:55PM +0300, Teodor Sigaev wrote:
> >postgres=# select to_tsvector('test text');
> > to_tsvector
> >---------------
> > 'test text':1
> >(1 row)
> Ok. that's related to
> http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/wordparser/parser.c.diff?r1=1.11;r2=1.12;f=h
> commit. Thomas pointed that it can be non-breakable space (0xa0) and that
> commit assumes any character with C locale and multibyte encoding and
> > 0x7f is alpha.
> To check theory, pls, apply attached patch.
>
> If so, I'm confused, we can not assume that 0xa0 is a space symbol in any
> multibyte encoding, even in Windows.

Nope, same result with this patch.

//Magnus