Please do not top-post.
Hello,I'm trying to implement a keyword-based search on my database.I have some text input (Say, "The A or B and C with D") and a list of keywords ("B","X","Y"). I want to identify which keywords are present in the text input. I also want to use a thesaurus to process the text input, since the keywords have synonyms (for example, "The A : X").As far as I understand, the only way for me to process text input with a thesaurus would be to use tsvector / tsquery; or at least this is the fastest way.Is it possible to retrieve the tsvector from the text input, then retrieve the tsvector from the keyword list, and identify which words are present in both?From its description the tsvector_to_array function sounds like it would be helpful here.tsvector_to_array(input) -> unnestintersecttsvector_to_array(keywords) -> unnestDavid J.
Thank you!It seems like this function's not available in PostgreSQL 9.5 that I'm usingI'm getting: "ERROR: function tsvector_to_array(unknown) does not exist".Do you know of any equivalent function in earlier releases?I was thinking of casting ts_vector into text, and then splitting the text using a delimiter; but this seems like a slow solution and I was wondering if there's anything better.
Not that I can see nor would expect given someone to the time to write it for 9.6 (sorry, about not pointing that out - blindly suggesting solutions only available in beta is not something I usually do).
Your best bet I'd say is using "strip(tsvector)" to clean the inputs, then use "string_to_array(...)" to get your array and then unnest/intersect
There may be other solutions available in 3rd-party extensions or that I'm just missing but just using standard 9.5 stuff that's what I see.
I'd solve your problem using a custom function and if you indeed feel it is too slow then explore alternatives.
David J.