Re: keyword search help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]



Here's my implementation of keyword searching, done in
pure sql and php.
My plan was to have specific document types in their own
specific directory (i.e. /pdfs /doc /etc.).  The design has
changed somewhat but this should give a better idea
of how to do keyword searching.

If you're comfortable with unix and can usually get
software to work, you might look into Open Full Text
Search < http://openfts.sourceforge.net >.  I don't think
it has a php extension, but you may want to look at it.
There are also several good search engines available, a
list of which is provided at Search Tools
http://www.searchtools.com/tools/tools-opensource.html


<---------- Postgresql description ------------->

CREATE TABLE doclist (
  doc_id       int not null,
  doc_summary  varchar(50),
  doc_keywords char(200),
  doc_type     int not null
)
CREATE TABLE doctype (
  type_id      serial,
  type_name    varchar(35),
  doc_location varchar(36)
)

<---------- End Postgresql description ------------->

The keywords were to be submitted in a form field called
"formvars".  This is somewhat of a hack put together
with tips and suggestions in the pgsql and php mailing
lists.  The result of the bottom script is that you will
have the relevant documents in weighted order.

<---------- PHP description ------------->

$_POST["formvars"]=trim($_POST["formvars"]);
$search_items=explode(" ", $_POST["formvars"]);

// Create query based on search input

switch (count ($search_items))
{
        case 0:
                header('Location:' . $HTTP_SERVER_VARS["HTTP_REFERER"]);
                break;
        case 1:
                $DocQuery="SELECT ";
                $DocQuery.="  ds.doc_summary, ds.doc_id, dt.type_name,
dt.doc_location, 
ds.score ";
                $DocQuery.="FROM ";
                $DocQuery.="  doctype dt, ";
                $DocQuery.="  (select doc_id, doc_summary, doc_type, (";
                $DocQuery.="  (CASE WHEN doc_keywords LIKE '%" .
$search_items[0] . "%' 
THEN 1 ELSE 0 END) ) ";
                $DocQuery.="  AS score FROM doclist) ds ";
                $DocQuery.="WHERE";
                $DocQuery.="  ds.doc_type=dt.type_id AND ";
                $DocQuery.="  ds.score>=1 ";
                $DocQuery.="ORDER BY";
                $DocQuery.="  score DESC;";
                break;
        case 2:
                $DocQuery="SELECT ";
                $DocQuery.="  ds.doc_summary, ds.doc_id, dt.type_name,
dt.doc_location, 
ds.score ";
                $DocQuery.="FROM ";
                $DocQuery.="  doctype dt, ";
                $DocQuery.="  (select doc_id, doc_summary, doc_type, (";
                $DocQuery.="  (CASE WHEN doc_keywords LIKE '%" .
$search_items[0] . "%' 
THEN 1 ELSE 0 END) + ";
                $DocQuery.="  (CASE WHEN doc_keywords LIKE '%" .
$search_items[1] . "%' 
THEN 1 ELSE 0 END) ) ";
                $DocQuery.="  AS score FROM doclist) ds ";
                $DocQuery.="WHERE";
                $DocQuery.="  ds.doc_type=dt.type_id AND ";
                $DocQuery.="  ds.score>=2 ";
                $DocQuery.="ORDER BY";
                $DocQuery.="  score DESC;";
                break;
        default:
                $arraycount=count ($search_items);
                $i=0;
                $DocQuery="SELECT ";
                $DocQuery.="  ds.doc_summary, ds.doc_id, dt.type_name,
dt.doc_location, 
ds.score ";
                $DocQuery.="FROM ";
                $DocQuery.="  doctype dt, ";
                $DocQuery.="  (select doc_id, doc_summary, doc_type, (";
                while($i < ($arraycount-1))
                {
                        $DocQuery.="  (CASE WHEN doc_keywords LIKE '%" .
$search_items[$i] . "%' 
THEN 1 ELSE 0 END) + ";
                        $i++;
                }
                $DocQuery.="  (CASE WHEN doc_keywords LIKE '%" . 
$search_items[$arraycount-1] . "%' THEN 1 ELSE 0 END) ) ";
                $DocQuery.="  AS score FROM doclist) ds ";
                $DocQuery.="WHERE";
                $DocQuery.="  ds.doc_type=dt.type_id AND ";
                $DocQuery.="  ds.score>=" . ($arraycount-1) . " ";
                $DocQuery.="ORDER BY";
                $DocQuery.="  score DESC;";
}

// Search database

$result = pg_Exec($conn, $DocQuery);
if (!$result)
{
                exit();
}
else
        {
        $num=pg_numrows($result);
        if ($num<>0)
        {
                echo "<h2>$num record(s) found.</h2>";
                $i=0;
                while($i < $num)
                {
                        @$row=pg_fetch_array($result,$i);
                        echo "<br>\n";
                        echo "<a href=\"" . $row["doc_location"] .
$row["doc_id"] . 
".pdf\"><strong>" . $row["doc_id"] ."</strong></a>";
                        echo "<img src=\"/images/pdf.gif\" width=\"20\"
height=\"22\" border=\"0\" 
alt=\"PDF document\"><br>\n";
                        echo "<strong>Category: </strong>" .
$row["type_name"] ."<br>\n";
                        echo "<strong>Summary: </strong>" .
$row["doc_summary"] ."<br>\n";
                        echo "</br>\n";
                        $i++;
                }
        }
        else
        {
                echo "<h2>No records found.</h2>";
        }


<---------- End PHP description ------------->

The above query, given some keywords, would result in something similar
to the following:


       doc_summary      | doc_id | type_name | doc_location  | score
-----------------------+--------+-----------+---------------+-------
  AV195 - Exploded view |   1002 | VACUUMS   | /techdocs/av/ |     3
  AV395 - Exploded view |   1003 | VACUUMS   | /techdocs/av/ |     2
(2 rows)


..which php then proceeds to list in ordered form.  In any case,
hope this helps, if not, maybe some of the links will prove useful.

do

arun kv wrote:
> 
> thnx for reply sir. actually we are maintaining a cdrom database in
> postgresql wherein we have cd rom titlename,author,year etc and keywords
> as fields. we will store a set of keywords for each record. i want to know
> how to enter those keywords in to database (whether to have comma between
> keywords or plain) and code to search for keywords. i.e. if i enter a
> keyword then it should search for that keyword in keyword field and
> display result. keywords are strings an there will also be some
> combinational search. i believe u will see to this and do the needful.
>   thanking you
>          with rgds
>              Arun
> 
> On Thu, 31 Jan 2002, Josh Berkus wrote:
> 
> > Arun,
> >
> > >  i am arun from iisc,bangalore. we are on a database project and are
> > >  using
> > > postgresql and php. i am finding some problems in keyword search. we
> > >  have
> > > to store some keywords in a field and when we give some keywords it
> > >  should
> > > search for the same in that field. how to store in database and how
> > >  to
> > > access is the problem. pls help me in this matter.
> > >   with rgds
> >
> > Unfortunately, your question does not have a simple answer, as there
> >  are several options depending on waht you're searching, how often, how
> >  many users, and what kind of keywords you want to use.
> >
> > What you need is a book on the basics of database design. I do not know
> >  of one that has been translated into Indonesian.   You may check out
> >  lists of English-language books at:
> > http://techdocs.postgresql.org/bookreviews.php
> > and
> > http://www3.us.postgresql.org/books/index.html
> >
> > If English books are no use to you, or exchange rates put them out of
> >  your price range, e-mail me back and I will outline some simple
> >  methods for keyword searching.
> >
> > -Josh Berkus
> >
> > ______AGLIO DATABASE SOLUTIONS___________________________
> >                                        Josh Berkus
> >   Complete information technology      josh@xxxxxxxxxxxx
> >    and data management solutions       (415) 565-7293
> >   for law firms, small businesses        fax 621-2533
> >     and non-profit organizations.      San Francisco
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
> >
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@xxxxxxxxxxxxxx)


[Index of Archives]     [Postgresql General]     [Postgresql Admin]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Backpacking]     [Postgresql Jobs]

  Powered by Linux