Thanks, and I didn't know about ts_debug, so thanks for that also.

For the record, I see how to use my own processing function (e.g.
dropatsymbol) to get what I need:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html

However, can you explain the logic behind the parsing difference if I just
add a ".s" to a string:

ossdb=# select ts_debug('gallery2-httpd-2.1-conf.');
                                ts_debug
-----------------------------------------------------------------------
 (default,hword,"Hyphenated word",gallery2-httpd-2,{simple},"'2' 'httpd' 'gallery2' 'gallery2-httpd-2'")
 (default,part_hword,"Part of hyphenated word",gallery2,{simple},'gallery2')
 (default,lpart_hword,"Latin part of hyphenated word",httpd,{en_stem},'httpd')
 (default,float,"Decimal notation",2.1,{simple},'2.1')
 (default,lpart_hword,"Latin part of hyphenated word",conf,{en_stem},'conf')
(5 rows)

ossdb=# select ts_debug('gallery2-httpd-2.1-conf.s');
                               ts_debug
---------------------------------------------------------------------
 (default,host,Host,gallery2-httpd-2.1-conf.s,{simple},'gallery2-httpd-2.1-conf.s')
(1 row)

Thanks again,
Bob


On 9/6/07 11:19 AM, "Oleg Bartunov" <oleg@xxxxxxxxxx> wrote:

> This is how default parser works. See output from
> select * from ts_debug('gallery2-httpd-conf');
> and
> select * from ts_debug('httpd-2.2.3-5.src.rpm');
>
> All token type:
>
> select * from token_type();
>
>
> On Thu, 6 Sep 2007, RC Gobeille wrote:
>
>> I'm having trouble understanding to_tsvector. (PostgreSQL 8.1.9 contrib)
>>
>> In this first case converting 'gallery2-httpd-conf' makes sense to me and is
>> exactly what I want. It looks like the entire string is indexed plus the
>> substrings broken by '-' are indexed.
>>
>>
>> ossdb=# select to_tsvector('gallery2-httpd-conf');
>>                        to_tsvector
>> ---------------------------------------------------------
>>  'conf':4 'httpd':3 'gallery2':2 'gallery2-httpd-conf':1
>>
>>
>> However, I'd expect the same to happen in the httpd example - but it does not
>> appear to.
>>
>> ossdb=# select to_tsvector('httpd-2.2.3-5.src.rpm');
>>         to_tsvector
>> ---------------------------
>>  'httpd-2.2.3-5.src.rpm':1
>>
>> Why don't I get: 'httpd', 'src', 'rpm', 'httpd-2.2.3-5.src.rpm' ?
>>
>> Is this a bug or design?
>>
>>
>> Thank you!
>> Bob
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83