On Wed, 25 Jul 2007, Rajarshi Guha wrote:
Hi, I have a table with about 9M entries. The table has 2 fields: id and name
which are of serial and text types respectively. I have an ordinary index on
the text field which allows me to do searches in reasonable time. Most of my
searches are of the form
select * from mytable where name ~ 'some text query'
I know that the Tsearch2 module will let me have very efficient text
searches. But if I understand correctly, it's based on a language-specific
dictionary.
Wrong! It comes with dictionaries for some human languages, but you can
write your own dictionaries. A dictionary is just a C program.
My problem is that the name column contains names of chemicals. Now for many
cases this may simply be a number (1674-56-2) and in other cases it may be an
alphanumeric string (such as (-)O-acetylcarnitine or
1,2-cis-dihydroxybenzoate). In some cases it is a well-known word (say viagra
or calcium chloride or pentathol).
My question is: will Tsearch2 be able to handle this type of text? Or will it
be hampered by the fact that the bulk of the rows do not correspond to
ordinary English?
Yes, it can handle this. See, for example, our dict_regex dictionary, which
we use for astronomical search.
http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html
This is a work in progress, but it works.
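
To make the idea of a pattern-driven dictionary concrete, here is a minimal
sketch in Python. It is not the dict_regex code and not the Tsearch2 C API;
the names CAS_RE, cas_checksum_ok and lexize, and the "cas:" prefix, are
hypothetical and chosen only for illustration. It assumes that identifiers
such as 1674-56-2 are CAS registry numbers, and it returns None for anything
it does not recognize, so that another dictionary could take over.

```python
import re

# Hypothetical pattern for CAS registry numbers such as 1674-56-2:
# 2-7 digits, then 2 digits, then a single check digit, hyphen-separated.
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def cas_checksum_ok(token):
    """Return True if token has CAS shape and a valid check digit."""
    m = CAS_RE.match(token)
    if m is None:
        return False
    digits = m.group(1) + m.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(m.group(3))


def lexize(token):
    """Toy stand-in for a dictionary's lexize step (illustration only).

    Returns a list holding one normalized lexeme, or None when this
    dictionary does not recognize the token.
    """
    token = token.strip()
    if cas_checksum_ok(token):
        return ["cas:" + token]
    return None
```

For example, lexize("1674-56-2") returns ["cas:1674-56-2"], while
lexize("viagra") returns None and would be left to an ordinary word
dictionary.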
-------------------------------------------------------------------
Rajarshi Guha <rguha@xxxxxxxxxxx>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
"My Ethicator machine must have had a built-in moral
compromise spectral phantasmatron! I'm a genius."
-Calvin
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83