Mark Stosberg wrote:
Joshua D. Drake wrote:
Madison Kelly wrote:
Hi all,
I am asking in this list because, at the end of the day, this is a
performance question.
I am looking at writing a search engine of sorts for my database. I
have only ever written very simple search engines before which amounted
to not much more than the query string being used with ILIKE on a pile
of columns. This was pretty rudimentary and didn't offer anything like
relevance sorting and such (I'd sort by result name, age or whatnot).
So I am hoping some of you guys and gals might be able to point me
towards some resources or offer some tips or gotchas before I get
started on this. I'd really like to come up with a more intelligent
search engine that doesn't take two minutes to return results. :) I
know, in the end good indexes and underlying hardware will be important,
but as sane a query structure as possible helps to start with.
See search.postgresql.org; you can download all the source from
gborg.postgresql.org.
Joshua,
What's the name of the project referred to? There's nothing named
"search" hosted on Gborg according to this project list:
http://gborg.postgresql.org/project/projdisplaylist.php
Madison,
For small data sets and simpler searches, the approach you have been
using can be appropriate. You may just want to use a small tool in a
regular programming language to help build the query. I wrote such a
tool for Perl:
http://search.cpan.org/~markstos/SQL-KeywordSearch-1.11/lib/SQL/KeywordSearch.pm
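To illustrate the general idea of building the query in a small helper, here is a minimal Python sketch. It is not the interface of the Perl module above; the function name `build_keyword_where` and the column names in the usage note are made up for illustration. It assumes the column names come from trusted code (not user input) and that the driver uses `%s` placeholders, as psycopg does.

```python
def build_keyword_where(keywords, columns):
    """Return (sql_fragment, params) for a simple keyword search.

    Every keyword must match at least one of the given columns.
    Column names are interpolated directly, so they must be trusted.
    """
    if not keywords or not columns:
        return "TRUE", []
    clauses = []
    params = []
    for word in keywords:
        # Escape LIKE wildcards so the user's text is matched literally
        # (backslash is PostgreSQL's default LIKE escape character).
        escaped = (
            word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        )
        per_column = " OR ".join(f"{col} ILIKE %s" for col in columns)
        clauses.append(f"({per_column})")
        # One parameter per column for this keyword.
        params.extend([f"%{escaped}%"] * len(columns))
    return " AND ".join(clauses), params
```

Calling `build_keyword_where(["red", "car"], ["name", "notes"])` yields a fragment that can be appended after `WHERE`, plus the matching parameter list to pass to the driver.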
For large or complex searches, a more specialized search system may be
appropriate. I suspect that's the kind of tool that Joshua is referencing.
Mark
Thanks Joshua and Mark!
Joshua, I've been digging around the CVS (web) looking for the search
engine code but so far have only found the reference (www.search) in
'general.php' but can't locate the file. You wouldn't happen to have a
direct link would you?
Mark, Thanks for a link to your module. I'll take a look at its
source and see how you work your magic. :)
I think the more direct question I was trying to get at is "How do
you build a 'relevance' search engine? One where results are
returned/sorted by relevance of some sort?". At this point, the best I
can think of would be to perform multiple queries: first matching the
whole search term, then the search term starting a row, then ending a
row, then anywhere in a row and "scoring" the results based on which
query they came out on. This seems terribly cumbersome (and probably
slow, indexes be damned) though. I'm hoping there is a better way! :)
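
As a rough sketch of that scoring idea, the four passes could be folded into a single query with a CASE expression. This is only an illustration under assumptions: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL (where ILIKE would replace LIKE), and the table `items`, the column `name`, and the function `ranked_search` are all hypothetical.

```python
import sqlite3


def ranked_search(conn, term):
    """Return (name, score) rows, best match first, from one query."""
    sql = """
        SELECT name,
               CASE
                   WHEN name LIKE :exact  THEN 4  -- whole value matches
                   WHEN name LIKE :prefix THEN 3  -- value starts with the term
                   WHEN name LIKE :suffix THEN 2  -- value ends with the term
                   ELSE 1                         -- term appears somewhere
               END AS score
        FROM items
        WHERE name LIKE :anywhere
        ORDER BY score DESC, name
    """
    # Note: '%' or '_' inside the term would act as wildcards here.
    params = {
        "exact": term,
        "prefix": term + "%",
        "suffix": "%" + term,
        "anywhere": "%" + term + "%",
    }
    return conn.execute(sql, params).fetchall()


# Small in-memory data set to try it out.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [
        ("search",),
        ("search engine",),
        ("full text search",),
        ("a searchable index",),
        ("unrelated",),
    ],
)
results = ranked_search(conn, "search")
```

The CASE branches are checked in order, so each row gets the score of the strongest pattern it matches, and the whole thing costs one scan rather than four separate queries. Whether it is fast enough on a large table is a separate question, since a leading-wildcard pattern generally cannot use an ordinary index.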
Madi