Re: Multiple word synonyms (maybe?)


On Tue, 2015-10-20 at 19:35 +0900, Tim van der Linden wrote:
> Hi All
> 
> I have a question regarding PostgreSQL's full text capabilities and
> (presumably) the synonym dictionary.
> 
> I'm currently implementing FTS on a medical-themed setup which uses
> domain-specific jargon. One specific request I wish to implement is
> support for the jargon synonyms that are heavily used.
> 
> Of course, I can simply go ahead and create my own synonym dictionary
> with a jargon-specific synonym file to feed it. However, most of the
> synonyms consist of more than a single word.
> 
> The term "heart attack" for example has the following "synonyms":
> 
> - Acute MI
> - MI
> - Myocardial infarction
> 
> As far as I understand it, the tokenizer within the PostgreSQL FTS
> engine splits words on spaces to generate tokens, which are then
> offered to each dictionary. I think it is therefore impossible to have
> "multi-word synonyms" in this sense, as multiple words cannot reach
> the dictionary. The term "heart attack" would be presented as the
> tokens "heart" and "attack".
> 
> From a technical standpoint I understand FTS is about looking at
> individual words and lexemizing them ... yet from a natural language
> lookup perspective you still wish to tie "Heart attack" to "Acute MI",
> so that when a client searches for one, the other will turn up as well.
> 
> Should I write my own tokenizer to catch all these words and present
> them as a single token? Or is this completely outside the realm of
> FTS (or FTS within Postgresql)?
> 
> Cheers,
> Tim
> 
> 


Looking at this from an entirely different perspective, why are you not
using ICD codes to identify patient events?
There is a one-to-many relationship between a patient and their events,
each identified by the relevant ICD code and date.
Given that MI has several applicable ICD codes, you can use a SELECT
along the lines of:
WHERE icd_code IN (  . . . )
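
To make that a bit more concrete, here is a rough sketch of the layout I have in mind. It uses Python's built-in sqlite3 module purely so that it runs standalone; the same SQL shape should carry over to PostgreSQL. The table names, column names and sample rows are all hypothetical, and the ICD codes in MI_CODES are illustrative placeholders that you would replace with the codes your coding standard actually lists for MI.

```python
import sqlite3

# Hypothetical schema: one patient has many dated events,
# each event tagged with a single ICD code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE patient_event (
    event_id   INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient (patient_id),
    icd_code   TEXT NOT NULL,
    event_date TEXT NOT NULL
);
""")

# Made-up sample data, for illustration only.
conn.executemany(
    "INSERT INTO patient (patient_id, name) VALUES (?, ?)",
    [(1, "Patient A"), (2, "Patient B")],
)
conn.executemany(
    "INSERT INTO patient_event (patient_id, icd_code, event_date) VALUES (?, ?, ?)",
    [
        (1, "I21.0", "2015-03-02"),
        (1, "J18.9", "2015-06-11"),
        (2, "I21.9", "2015-09-30"),
    ],
)

# Illustrative placeholder list; verify the real set of MI codes
# against the ICD revision you are using.
MI_CODES = ("I21.0", "I21.9")


def events_for_codes(connection, codes):
    """Return (name, icd_code, event_date) rows whose code is in `codes`."""
    # One bound parameter per code, so the values are never spliced into the SQL.
    placeholders = ", ".join("?" for _ in codes)
    sql = f"""
        SELECT p.name, e.icd_code, e.event_date
        FROM patient_event AS e
        JOIN patient AS p USING (patient_id)
        WHERE e.icd_code IN ({placeholders})
        ORDER BY e.event_date
    """
    return connection.execute(sql, tuple(codes)).fetchall()


mi_events = events_for_codes(conn, MI_CODES)
```

The point of the design is that "heart attack", "Acute MI" and "Myocardial infarction" never need to be matched as text; they all resolve to the same set of codes before the query runs.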


I know it doesn't answer your question!

Cheers,
Rob


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


