
Re: using Tsearch2 for chemical text


 



Naz, here is the posted link to the dict_regex dictionary for tsearch2: http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html

Feel free to test it and send us feedback. It's rather general, of course,
since it uses regular expressions (via the PCRE library).

Oleg
On Thu, 26 Jul 2007, Naz Gassiep wrote:


I think you might need to write a custom lexer to divide the strings
into meaningful units.  If there are subsections of these names that
make sense to search for, then tsearch2 can certainly handle the
mechanics of that, but I doubt that the standard rules will divide
these names into lexemes usefully.

A custom lexer for tsearch2 that recognized chemistry-related lexical components (di-, tetra-, acetyl-, ethan-, -oic, -ane, -ene, etc.) would *hugely* increase the out-of-the-box applicability of PostgreSQL to scientific applications. Perhaps such an effort could be coordinated with a physics-based lexer and a biology-related lexer to provide a unified lexer offering full scientific capabilities, in the way that PostGIS provides unified geospatial capabilities.

I don't know how best to bring such an effort about, but I do know that if such a thing were created it would be a boon for PostgreSQL, giving it a very significant leg up in terms of functionality, not to mention the great positive impact that the wide, free availability of such a tool would have on the scientific research community.
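
To make the idea of chemistry-aware lexing a bit more concrete, here is a minimal sketch in Python of splitting a chemical name into components such as di-, tetra-, acetyl-, ethan-, -oic, -ane and -ene using a single regular expression. It is not the dict_regex dictionary and not a tsearch2 parser; the morpheme lists and the function name split_chemical_name are hypothetical and far from a complete nomenclature.

```python
import re

# Hypothetical, deliberately tiny morpheme inventory for illustration only.
PREFIXES = ["tetra", "tri", "di", "mono"]
STEMS = ["acetyl", "methyl", "ethyl", "chloro", "meth", "ethan", "eth", "prop", "but"]
SUFFIXES = ["oic", "ane", "ene", "yne", "ol"]

# Put longer alternatives first so that, at any position, "ethan" is tried
# before "eth" and "methyl" before "meth".
MORPHEMES = sorted(PREFIXES + STEMS + SUFFIXES, key=len, reverse=True)
MORPHEME_RE = re.compile("|".join(re.escape(m) for m in MORPHEMES))


def split_chemical_name(name):
    """Return the known morphemes found in `name`, scanning left to right.

    Characters that match no known morpheme are silently skipped.
    """
    return MORPHEME_RE.findall(name.lower())


if __name__ == "__main__":
    for example in ("dichloromethane", "tetrachloroethene", "ethanoic"):
        print(example, split_chemical_name(example))
```

A sketch like this will also find spurious matches inside unrelated words (for example "di" inside "iodide"), which is one reason a real lexer would need a proper grammar or dictionary behind it rather than a flat list of patterns.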




	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
