Re: using Tsearch2 for chemical text

On Wed, 25 Jul 2007, Rajarshi Guha wrote:

Hi, I have a table with about 9M entries. The table has 2 fields, id and name, which are of serial and text types respectively. I have an ordinary index on the text field which allows me to do searches in reasonable time. Most of my searches are of the form

select * from mytable where name ~ 'some text query'

I know that the Tsearch2 module will let me have very efficient text searches. But if I understand correctly, it's based on a language-specific dictionary.

Wrong! It comes with dictionaries for some written human languages, but you can
write your very own dictionaries. A dictionary is just a C program.
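
To make that concrete, here is a rough Python model (not the C API) of what a Tsearch2 dictionary does: given one token, it returns the normalized lexemes, an empty list for a stop word, or nothing when it does not recognize the token. The names lexize, STOP_WORDS and SYNONYMS, and the sample data, are made up for illustration.

```python
from typing import List, Optional

# Hypothetical sample data, for illustration only.
STOP_WORDS = {"the", "of", "and"}
SYNONYMS = {"viagra": "sildenafil"}


def lexize(token: str) -> Optional[List[str]]:
    """Model of a dictionary lookup for a single token.

    Returns a list of lexemes if the token is recognized, an empty list
    if it is a stop word, and None if this dictionary does not know it.
    """
    word = token.lower()
    if word in STOP_WORDS:
        return []                  # stop word: contributes nothing to the index
    if word in SYNONYMS:
        return [SYNONYMS[word]]    # recognized: emit the normalized lexeme
    return None                    # unknown: leave it to another dictionary
```

A real dictionary does the same job in C, so nothing ties it to English or to any human language.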


My problem is that the name column contains names of chemicals. Now for many cases this may simply be a number (1674-56-2) and in other cases it may be an alphanumeric string (such as (-)O-acetylcarnitine or 1,2-cis-dihydroxybenzoate). In some cases it is a well-known word (say viagra or calcium chloride or pentathol).

My question is: will Tsearch2 be able to handle this type of text? Or will it be hampered by the fact that the bulk of the rows do not correspond to ordinary English?

Oh, sure. See, for example, our dict_regex dictionary, which we use for
astronomical search.
http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html

This is a work in progress, but it works.
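
The general idea of a regex-driven dictionary can be sketched in Python. This is not dict_regex's actual configuration or API, only the approach: a token that matches a known pattern is kept whole as one lexeme instead of being split on punctuation. The sketch assumes that identifiers like 1674-56-2 are CAS registry numbers, which the original question does not state; CAS_RE, cas_checksum_ok and chemical_lexize are hypothetical names.

```python
import re
from typing import List, Optional

# Assumed shape of a CAS registry number: 2-7 digits, 2 digits, 1 check digit.
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def cas_checksum_ok(token: str) -> bool:
    """Return True if token looks like a CAS number with a valid check digit."""
    m = CAS_RE.match(token)
    if m is None:
        return False
    digits = m.group(1) + m.group(2)
    # Weight digits 1, 2, 3, ... from the right; the sum mod 10 is the check digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(m.group(3))


def chemical_lexize(token: str) -> Optional[List[str]]:
    """Keep a valid identifier as a single lexeme; otherwise do not recognize it."""
    if cas_checksum_ok(token):
        return [token]
    return None
```

Names such as 1,2-cis-dihydroxybenzoate would need their own patterns; the point is only that the matching rules are yours to define.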


-------------------------------------------------------------------
Rajarshi Guha  <rguha@xxxxxxxxxxx>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
"My Ethicator machine must have had a built-in moral
compromise spectral phantasmatron! I'm a genius."
              -Calvin



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
    choose an index scan if your joining column's datatypes do not
    match

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

