
Re: Clarification of the "simple" dictionary



Don't guess; read the docs:
http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY

12.6.2. Simple Dictionary

The simple dictionary template operates by converting the input token to lower case and checking it against a file of stop words. If it is found in the file then an empty array is returned, causing the token to be discarded. If not, the lower-cased form of the word is returned as the normalized lexeme. Alternatively, the dictionary can be configured to report non-stop-words as unrecognized, allowing them to be passed on to the next dictionary in the list.
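To make the quoted paragraph concrete, here is a minimal sketch that builds a dictionary from the simple template with a stop word list and then switches it to report non-stop-words as unrecognized. The dictionary name public.simple_dict is only an illustrative choice, and the sketch assumes the stock english.stop file is present in $SHAREDIR/tsearch_data.

```sql
-- Illustrative dictionary name; assumes the stock english.stop file is installed.
CREATE TEXT SEARCH DICTIONARY public.simple_dict (
    TEMPLATE = pg_catalog.simple,
    STOPWORDS = english
);

-- A non-stop-word is lower-cased and returned as the lexeme.
SELECT ts_lexize('public.simple_dict', 'YeS');   -- {yes}

-- A stop word yields an empty array, so the token is discarded.
SELECT ts_lexize('public.simple_dict', 'The');   -- {}

-- Report non-stop-words as unrecognized so the next dictionary can try them.
ALTER TEXT SEARCH DICTIONARY public.simple_dict ( Accept = false );

SELECT ts_lexize('public.simple_dict', 'YeS');   -- NULL (unrecognized)
SELECT ts_lexize('public.simple_dict', 'The');   -- {} (still a stop word)
```

Accept = false is only useful when at least one more dictionary follows in the list; in last position it would pass the token on to nothing.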

d=# \dFd+ simple
                                          List of text search dictionaries
   Schema   |  Name  |     Template      | Init options |                        Description
------------+--------+-------------------+--------------+-----------------------------------------------------------
 pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword

By default it has no Init options, so it doesn't check for stopwords.
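A quick way to see this for yourself, as a sketch against the built-in simple dictionary and simple configuration (the sample sentence is arbitrary):

```sql
-- With no STOPWORDS option, even a common word like "The" is only lower-cased.
SELECT ts_lexize('simple', 'The');
-- {the}

-- The built-in "simple" configuration therefore keeps every word,
-- lower-cased and de-duplicated, with positions recorded.
SELECT to_tsvector('simple', 'The Fat Rats and the rats');
-- 'and':4 'fat':2 'rats':3,6 'the':1,5
```

Both "the" and "and" survive in the result, which matches the empty Init options column above.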

On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote:

> On 07/22/2010 06:27 PM, John Gage wrote:
>> The easiest way to look at this is to give the simple dictionary a document with to_tsvector() and see if stopwords pop out.
>>
>> In my experience they do. In my experience, the simple dictionary just breaks the document down into the space etc. separated words in the document. It doesn't analyze further.
>
> That's my experience too, I just want to make sure it doesn't actually have any stopwords which I've missed. Trying many phrases and checking for stopwords isn't really proving anything.
>
> Can anybody confirm the "simple" dict. only lowercases the words and "uniques" them?



	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


