On Sun, 17 Jan 2010 20:19:59 +0300 (MSK) Oleg Bartunov <oleg@xxxxxxxxxx> wrote: > Ivan, > > You can write function to get lexemes from tsvector: > CREATE OR REPLACE FUNCTION ts_stat(tsvector, weights text, OUT > word text, OUT ndoc integer, OUT nentry integer) > RETURNS SETOF record AS > $$ > SELECT ts_stat('SELECT ' || quote_literal( $1::text ) || > '::tsvector', quote_literal( $2::text) ); $$ LANGUAGE SQL RETURNS > NULL ON NULL INPUT IMMUTABLE; Thanks very much Oleg. Still it is not really making the pain go away. I've weights stored in my tsvector and I need to build the query using them. This means that if I have: 'aubergine':4A 'orange':1B 'banana':5A 'apple':3C and 'coconut':3B 'bananas':1A 'tomatoes:2C stored in a column (tsv) I really would like to build up the query: to_tsquery('aubergine:A | orange:B | bananas:A | apple:C') then tsv @@ to_tsquery('aubergine:A | orange:B | bananas:A | apple:C') and relative ts_rank() I'm aware that it is not symmetrical, but it looks as the cheapest and fastest thing I can do right now. I'm using pg_catalog.english. Am I supposing correctly that NO lexeme will contain spaces? If that is the case I could simply use string manipulation tools. Not nice to see but it will work. > Then, you can create ARRAY like: > > select ARRAY ( select (ts_stat(fts,'*')).word from papers where > id=2); > > Then, you will have two arrays and you're free to apply any > similarity function (cosine, jaccard,....) to calculate what do > you want. If you want to preserve weights, then use weight label > instead of '*'. What ts_rank does is more than enough right now. > Another idea is to use array_agg, but I'm not ready to discuss it. > > Please, keep in mind, that document similarity is a hot topic in Not hard to imagine. > IR, and, yes, I and Teodor have something about this, but code > isn't available for public. Unfortunately, we had no sponsor for > full-text search for last year and I see no perspectives this > year, so we postpone our text-search development. Good luck. Do you have anything like http://www.chipin.com/ for small donations? -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general