Re: ts_headline

Hmmmm!
I think I now understand the ts position better, thank you.

Part of my problem has been that I am used to the functionality of Open Text's 
LCS (aka BASIS) product which handles text differently.

It includes the position (and context) information in the index and 
"remembers" how the text was parsed, so it does not need to reparse the text 
to insert hit navigation tags, nor does it need to be told how to parse 
queries. (It also supports phrase searching.)

Now that I have a better understanding of ts, I think I will be able to make 
it do at least most of what I hoped for.

Thank you again for your help with this.

Cheers,
Stephen Davies

On Friday 22 February 2008 20:45, Richard Huxton wrote:
> Stephen Davies wrote:
> > Unfortunately, my link to the box with the test database is down due to
> > lack of maintenance by our local telco (Telstra) but I think that I also
> > missed the optional config arg to ts_headline.
> >
> > The lack of link also means that I cannot confirm your findings but your
> > logic looks good.
>
> Looks like ALTER DATABASE <dbname> SET default_text_search_config='english'
> is what you need.
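
A minimal sketch of that setting, assuming a database called "mydb" (a placeholder, not a name from this thread). In PostgreSQL 8.3 the parameter is spelled default_text_search_config. The last statement shows the other route mentioned above, passing the optional config argument to ts_headline directly; the sample sentence is made up.

```sql
-- "mydb" is a hypothetical database name used only for illustration.
ALTER DATABASE mydb SET default_text_search_config = 'pg_catalog.english';

-- The setting applies to sessions started after the change; verify with:
SHOW default_text_search_config;

-- Alternatively, name the configuration explicitly in each call,
-- so the result does not depend on the session default.
SELECT ts_headline('english',
                   'The quick brown foxes jumped over the lazy dogs',
                   to_tsquery('english', 'fox'));
```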
>
> > It raises the question, however, of why ts_headline needs to reparse the
> > raw text.
>
> It needs to line up tsvector lexemes with actual characters in the text.
> The tsvector is missing punctuation, any stopwords (the, it, a) as well
> as being stemmed (if your dictionary does that).
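
A quick way to see this, as a sketch with a made-up sentence: compare the raw text with what to_tsvector stores for it under the 'english' configuration. The result holds only stemmed lexemes with word positions; the stopwords, the punctuation, and the original word forms are gone, which is why the original text cannot be rebuilt from the tsvector alone.

```sql
-- The sample sentence is illustrative only.
-- With the 'english' configuration, stopwords such as "the" and "on"
-- are dropped and the remaining words are reduced to stemmed lexemes,
-- each tagged with its word position in the source text.
SELECT to_tsvector('english', 'The cats sat on the mat, quietly.');
```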
>
> Also, it's looking for a short span of words that provide the best
> match. That might not be a complete match of course, and is different to
> how you'd normally look to use a tsvector.
>
> > At least in my case, I am using a trigger to parse the combination of
> > Title and Abstract to a tsvector field in the table row (as suggested in
> > 12.2.2 and 12.4.3 in the doco) so that the tsvector is already available
> > to ts_headline.
> >
> > If ts_headline had the ability to use that pre-parsed tsvector, my
> > problem would never have arisen - and the performance of ts_headline
> > would be improved.
>
> Maybe. It would still have to parse the text to some degree though, just
> to get the original words & punctuation into the headline.
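
For reference, the setup discussed above can be sketched roughly as follows. The table, column, trigger, and index names (papers, title, abstract, tsv, and so on) and the search term are assumptions made up for illustration; tsvector_update_trigger is the built-in trigger function described in the full text search documentation. The stored tsvector is used for matching, while ts_headline is still handed the raw text.

```sql
-- Hypothetical table: "tsv" holds the pre-parsed title + abstract.
CREATE TABLE papers (
    id       serial PRIMARY KEY,
    title    text,
    abstract text,
    tsv      tsvector
);

-- Keep tsv up to date on every insert or update.
CREATE TRIGGER papers_tsv_update
    BEFORE INSERT OR UPDATE ON papers
    FOR EACH ROW
    EXECUTE PROCEDURE tsvector_update_trigger(tsv, 'pg_catalog.english',
                                              title, abstract);

-- Index the tsvector so that @@ searches do not scan the whole table.
CREATE INDEX papers_tsv_idx ON papers USING gin (tsv);

-- Matching uses the stored tsvector; ts_headline takes the original text
-- and is given the same configuration explicitly so both sides agree.
SELECT id,
       ts_headline('english',
                   coalesce(title, '') || ' ' || coalesce(abstract, ''),
                   q) AS headline
FROM papers, to_tsquery('english', 'parse') AS q
WHERE tsv @@ q;
```

Naming the configuration in ts_headline keeps the headline consistent with how the trigger built the tsvector, whatever the session default happens to be.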

-- 
========================================================================
This email is for the person(s) identified above, and is confidential to
the sender and the person(s).  No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies Consulting                            Voice: 08-8177 1595
Adelaide, South Australia.                             Fax: 08-8177 0133
Computing & Network solutions.                       Mobile:0403 0405 83
