
Re: tsearch2 dictionary for statute cites


On Tue, 10 Mar 2009, Tom Lane wrote:

> "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> writes:
>> People are likely to search for statute cites, which tend to have a
>> hierarchical form.  I'm not sure the prefix approach will work for
>> this.  For example, there is a section 939.64 in the state statutes
>> dealing with commission of a crime while wearing a bulletproof
>> garment.  If someone searches for that, they should find subsections
>> like 939.64(1) or 939.64(2) but not different sections which start
>> with the same characters like 939.641 (the section on concealing
>> identity) or 939.645 (the section on hate crimes).  A search for
>> chapter 939 should return any of the above.
>
> I think what you need is a custom parser that treats these similarly to
> hyphenated words.  If I pretend that the dot is a hyphen I get matching
> behavior that seems to meet all those requirements.
>
> Unfortunately we don't seem to have any really easy way to plug in a
> custom parser, other than copy-paste-modify the existing one which would
> be a PITA from a maintenance standpoint.  Perhaps you could pass the
> texts and the queries through a regexp substitution that converts
> digit-dot-digit to digit-dash-digit?
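
A minimal sketch of the substitution suggested above, for illustration
only. It assumes PostgreSQL's regexp_replace() with advanced regular
expressions; the sample text is made up, and how the default parser
then tokenizes the dashed form should be checked with ts_debug()
rather than taken for granted.

```sql
-- Rewrite every digit.digit as digit-digit.  The lookahead (?=\d)
-- leaves the second digit unconsumed, so a chained cite such as
-- 1.2.3 is rewritten at every dot.
SELECT regexp_replace('see 939.64(1) and 939.645',
                      E'(\\d)\\.(?=\\d)', E'\\1-', 'g') AS dashed;

-- Inspect how the default parser tokenizes the rewritten text.
SELECT alias, token
  FROM ts_debug('simple',
                regexp_replace('939.64(1)',
                               E'(\\d)\\.(?=\\d)', E'\\1-', 'g'));
```

The same expression would have to be applied to the query string as
well as to the indexed text, otherwise the two sides will not agree.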

Perhaps for 8.4 it's better to use prefix search: something like
to_tsquery('939.645:*') will find what Kevin needs. The problem is with
the parser, so I'd preprocess the text before indexing to convert every
digit.digit(digit) to digit.digit.digit, which the parser recognizes as
a single lexeme of type 'version'.  Here is just an illustration:

qq=# select * from ts_parse('default',translate('939.64(1)','()','. '));
 tokid |  token
-------+----------
     8 | 939.64.1
    12 |
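
Carried one step further, the same preprocessing could be wrapped in a
function and used on the index side. This is only a sketch:
normalize_cite is a hypothetical name, not something from this thread,
and the 'simple' configuration is an arbitrary choice. The prefix
operator :* requires 8.4 or later. Whether a given prefix query
separates 939.64(1) from 939.641 depends on how the query text itself
is tokenized, so test against real data.

```sql
-- Hypothetical helper: map '(' to '.' and ')' to ' ', as in the
-- ts_parse illustration above.
CREATE FUNCTION normalize_cite(text) RETURNS text AS $$
    SELECT translate($1, '()', '. ');
$$ LANGUAGE sql IMMUTABLE;

-- Index side: the normalized cite is what gets parsed and stored.
SELECT to_tsvector('simple', normalize_cite('939.64(1)'));

-- Query side: a chapter-level prefix search (8.4 or later).
SELECT to_tsvector('simple', normalize_cite('939.64(1)'))
       @@ to_tsquery('simple', '939:*') AS chapter_match;
```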

BTW, once the cite is a single 'version' token, it's possible to use
dict_regex on 8.3.



> 			regards, tom lane



	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
