On Fri, 29 Feb 2008, Richard Huxton wrote:
Oleg Bartunov wrote:
On Thu, 28 Feb 2008, Richard Greenwood wrote:
So far my best idea is to create a tsvector column containing both
padded and non-padded versions of the value. i.e. put both R1234 and
R0001234 into the tsvector column. This seems pretty brute force, and
I am pretty new to text search, so I'd welcome any suggestions.
Create your own dictionary that indexes R0001234 as both R0001234 and R1234.
It seems dict_regex is your friend:
http://vo.astronet.ru/arxiv/dict_regex.html
Nice - I was thinking something like that would be useful, but Googling
hadn't found me anything. Thanks for that link Oleg.
Wouldn't it be more efficient to have the regex-dictionary map just to R1234
though? Or R0001234, I suppose.
Sure. But having both variants in the index allows more flexible searches,
using different configurations with or without the mapping. Think about 'exact' search.
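
To illustrate the two mappings discussed above, here is a minimal sketch in
Python. It is not dict_regex itself and does not use its rule syntax; the
pattern (one letter followed by zero-padded digits) and the function names
strip_padding and index_variants are assumptions made up for this example.

```python
import re

# Assumed ID shape: a single letter prefix followed by digits, e.g. R0001234.
ID_PATTERN = re.compile(r"^([A-Za-z])0*(\d+)$")


def strip_padding(token):
    """Return the token without leading zeros, or unchanged if it is not an ID."""
    m = ID_PATTERN.match(token)
    if m is None:
        return token
    return m.group(1) + m.group(2)


def index_variants(token):
    """Return the forms to index: the token as written plus its unpadded form."""
    variants = [token]
    short = strip_padding(token)
    if short != token:
        variants.append(short)
    return variants
```

Indexing the output of index_variants keeps the original spelling available
for an 'exact' search, while a configuration that applies strip_padding to
the query can still match on the short form alone.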
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83