Re: FTS uses "tsquery" directly in the query

Hi, Ivan:

I agree with you and would also like to hack on the code. The current FTS is the best one among database systems and a great building block for supporting more functions. Here are some I can think of:
  • An optional parameter to choose "|" or "&" for to_tsquery and to_tsvector.
  • An option to turn normalization on or off for to_tsquery and to_tsvector.
  • The two current ranking functions are not enough. I have not figured out the algorithm behind the default, ts_rank. For ts_rank_cd we have the paper, but it is designed for short queries of 2 or 3 tokens.
  • Normalization could work like Apache Lucene, where it is really easy to modify things and build your own tokenizer. I still feel confused after reading the manual.
I am not sure whether there is currently a team helping Oleg Bartunov. If needed, I can try to contribute something rather than just hacking on it. I am sure Ivan will join too. :)
Xu
--- On Mon, 1/25/10, Ivan Sergio Borgonovo <mail@xxxxxxxxxxxxxxx> wrote:

From: Ivan Sergio Borgonovo <mail@xxxxxxxxxxxxxxx>
Subject: Re: FTS uses "tsquery" directly in the query
To: pgsql-general@xxxxxxxxxxxxxx
Date: Monday, January 25, 2010, 4:33 PM

On Mon, 25 Jan 2010 23:35:12 +0300 (MSK)
Oleg Bartunov <oleg@xxxxxxxxxx> wrote:

> Do you guys want something like:
>
> arxiv=# select and2or(to_tsquery('1 & 2 & 3'));
>         and2or
> ---------------------
>   ( '1' | '2' ) | '3'
> (1 row)

Nearly. I'm starting from a weighted tsvector, not from text/tsquery.
I would like to:
- keep the weights in the query
- avoid parsing the text to extract lexemes twice (I already have a
  tsvector)

Extending pg in C is new territory for me, but I'm trying
to write at least a couple of functions that:
- return a tsvector as (weight int, pos int[], lexeme text)
  records
- turn a tsvector + operator into a tsquery
  'orange':A1,2,3 'banana':B4,5 'tomato':C6,7 ->
  'orange':A | 'banana':B | 'tomato':C
  or eventually
  'orange':A & 'banana':B & 'tomato':C
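
The two functions described above can be sketched outside the database to make the intended behaviour concrete. This is only an illustration in Python, not the C extension under discussion: the names Entry, parse_weighted_vector and vector_to_query are hypothetical, the weight is kept as a letter rather than an int, and the parser accepts the informal notation used in this message (weight letter followed by positions). PostgreSQL's real tsvector text output attaches the weight to each position instead (for example 'orange':1A,2A,3A), and quoted lexemes containing escaped quotes are not handled here.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One lexeme of a weighted vector (hypothetical record layout)."""
    weight: str           # weight label A-D, kept as a letter in this sketch
    positions: List[int]  # positions of the lexeme in the document
    lexeme: str


# Matches entries written as in the message: 'lexeme':W1,2,3
# (informal notation, not PostgreSQL's actual tsvector output format).
_ENTRY = re.compile(r"'([^']*)':([A-D])(\d+(?:,\d+)*)")


def parse_weighted_vector(text: str) -> List[Entry]:
    """Split the informal vector notation into (weight, positions, lexeme) records."""
    return [
        Entry(weight, [int(p) for p in positions.split(",")], lexeme)
        for lexeme, weight, positions in _ENTRY.findall(text)
    ]


def vector_to_query(text: str, operator: str = "|") -> str:
    """Join the lexemes with the given operator, keeping weights and dropping positions."""
    if operator not in ("|", "&"):
        raise ValueError("operator must be '|' or '&'")
    separator = f" {operator} "
    return separator.join(
        f"'{entry.lexeme}':{entry.weight}" for entry in parse_weighted_vector(text)
    )


if __name__ == "__main__":
    vector = "'orange':A1,2,3 'banana':B4,5 'tomato':C6,7"
    print(parse_weighted_vector(vector))
    print(vector_to_query(vector, "|"))
    print(vector_to_query(vector, "&"))
```

The lexemes are read once from the existing vector and reused to build the query string, which mirrors the goal of not parsing the original text a second time.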

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

