
Re: using Tsearch2 for chemical text

I think you might need to write a custom lexer to divide the strings
into meaningful units.  If there are subsections of these names that
make sense to search for, then tsearch2 can certainly handle the
mechanics of that, but I doubt that the standard rules will divide
these names into lexemes usefully.

A custom lexer for tsearch2 that recognized chemistry-related lexical components (di-, tetra-, acetyl-, ethan-, -oic, -ane, -ene, etc.) would *hugely* increase the out-of-the-box applicability of PostgreSQL to scientific applications. Such an effort could perhaps be coordinated with a physics-based lexer and a biology-related lexer, to provide a unified lexer offering full scientific capabilities in the way that PostGIS provides unified geospatial capabilities.
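
To make the idea concrete, here is a rough sketch in Python of the kind of splitting such a lexer would have to do. It is not tsearch2 code and does not use any tsearch2 API; the morpheme list and the function name `split_chemical_name` are made up for illustration, and a real lexer would need a properly curated chemical vocabulary. It does a greedy longest-match against the known components and keeps anything it does not recognize as its own token.

```python
# Hypothetical, deliberately tiny morpheme list (an assumption for this sketch).
MORPHEMES = ["tetra", "acetyl", "methyl", "chloro", "ethan",
             "tri", "di", "oic", "ane", "ene"]


def split_chemical_name(name):
    """Split a name into known morphemes by greedy longest match.

    Letters that match no morpheme are collected into "unknown" tokens.
    Non-letters (digits, hyphens, commas, spaces) act as separators and
    are dropped.
    """
    text = name.lower()
    # Try longer morphemes first so "tetra" wins over any shorter match.
    ordered = sorted(MORPHEMES, key=len, reverse=True)
    tokens = []
    unknown = ""
    i = 0
    while i < len(text):
        ch = text[i]
        if not ch.isalpha():
            # A separator ends the current run of unrecognized letters.
            if unknown:
                tokens.append(unknown)
                unknown = ""
            i += 1
            continue
        match = next((m for m in ordered if text.startswith(m, i)), None)
        if match is not None:
            if unknown:
                tokens.append(unknown)
                unknown = ""
            tokens.append(match)
            i += len(match)
        else:
            unknown += ch
            i += 1
    if unknown:
        tokens.append(unknown)
    return tokens


print(split_chemical_name("dimethyl"))
print(split_chemical_name("ethanoic acid"))
print(split_chemical_name("tetrachloroethene"))
```

Greedy matching is the simplest thing that could work and it will mis-split plenty of real names (a short prefix like "di" can match in the middle of an unrelated word), which is part of why a serious version would need domain knowledge rather than a flat list.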

I don't know how best to bring such an effort about, but I do know that if such a thing were created it would be a boon for PostgreSQL, giving it a very significant leg up in terms of functionality, not to mention the great positive impact that the wide, free availability of such a tool would have on the scientific research community.


