On Tue, 2015-10-20 at 19:35 +0900, Tim van der Linden wrote:
> Hi All
>
> I have a question regarding PostgreSQL's full text capabilities and
> (presumably) the synonym dictionary.
>
> I'm currently implementing FTS on a medical themed setup which uses
> domain specific jargon to denote a bunch of stuff. A specific request
> I wish to implement here are the jargon synonyms that are heavily
> used.
>
> Of course, I can simply go ahead and create my own synonym dictionary
> with a jargon specific synonym file to feed it. However, most of the
> synonyms are comprised out of more than a single word.
>
> The term "heart attack" for example has the following "synonyms":
>
> - Acute MI
> - MI
> - Myocardial infarction
>
> As far as I understand it, the tokenizer within PostgreSQL FTS engine
> splits words on spaces to generate tokens which are then proposed to
> each dictionary. I think it is therefore impossible to have
> "multi-word synonyms" in this sense as multiple words cannot reach the
> dictionary. The term "heart attack" would be presented as the tokens
> "heart" and "attack".
>
> From a technical standpoint I understand FTS is about looking at
> individual words and lexemizing them ... yet from a natural language
> lookup perspective you still wish to tie "Heart attack" to "Acute MI"
> so when a client searches on one, the other will turn up as well.
>
> Should I write my own tokenizer to catch all these words and present
> them as a single token? Or is this completely outside the realm of
> FTS (or FTS within Postgresql)?
>
> Cheers,
> Tim
>
>

Looking at this from an entirely different perspective, why are you not
using ICD codes to identify patient events?

It is a one-to-many relationship between a patient and their events, each
identified by the relevant ICD code and date. Given that MI has several
applicable ICD codes, you can use a select along the lines of:

WHERE icd_code IN ( . . . )

I know it doesn't answer your question!
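To make the ICD suggestion a bit more concrete, here is a rough sketch of the one-to-many layout and the IN lookup. Everything in it is an assumption for illustration only: the table and column names (patient, patient_event, icd_code, event_date), the helper patients_with_codes, the sample rows, and the two ICD-10-style codes, which are placeholders and not a vetted list of MI codes. It runs against an in-memory SQLite database from Python purely so it is self-contained; the query has the same shape in PostgreSQL.

```python
import sqlite3

# Hypothetical schema: one patient has many events, each tagged with an
# ICD code and a date. All names here are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE patient_event (
        event_id   INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        icd_code   TEXT NOT NULL,
        event_date TEXT NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO patient (patient_id, name) VALUES (?, ?)",
    [(1, "Patient A"), (2, "Patient B"), (3, "Patient C")],
)
conn.executemany(
    "INSERT INTO patient_event (patient_id, icd_code, event_date) VALUES (?, ?, ?)",
    [
        (1, "I21.0", "2015-03-02"),
        (1, "J45.9", "2015-06-11"),
        (2, "I21.9", "2015-09-30"),
        (3, "E11.9", "2015-01-15"),
    ],
)

# Placeholder codes standing in for "the ICD codes that apply to MI".
# A real list would come from whoever maintains the clinical coding.
MI_CODES = ("I21.0", "I21.9")


def patients_with_codes(conn, codes):
    """Return (patient_id, name, icd_code, event_date) for matching events."""
    # One bound parameter per code, so the values are never spliced into SQL.
    placeholders = ", ".join("?" for _ in codes)
    sql = f"""
        SELECT p.patient_id, p.name, e.icd_code, e.event_date
        FROM patient AS p
        JOIN patient_event AS e ON e.patient_id = p.patient_id
        WHERE e.icd_code IN ({placeholders})
        ORDER BY e.event_date
    """
    return conn.execute(sql, tuple(codes)).fetchall()


rows = patients_with_codes(conn, MI_CODES)
for row in rows:
    print(row)
```

The point of this layout is that "heart attack", "Acute MI" and "Myocardial infarction" never need to match each other as text: whichever term the clinician used, the event is stored under a code, and the lookup goes through the code list.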

Cheers,
Rob