Hi Ashley. That's what I'm doing, but want to limit what gets stored
to just the content. Every page has a drop-down menu of, for
instance, various cities, so I want to exclude that part of the page
in my search terms. I don't want every page to think it has content
on Calgary, Hamburg, Phoenix, etc.
I finally walked myself through the code I had been modifying, and
found that they were parsing it line by line with an fgets (rather
than word by word). So I was able to use Stuart's original idea of
surrounding the content with a unique comment line, and setting a
flag to tell whether to parse the line or not. If it found the
<!-- start_search --> line, it would start parsing, and stop again
when it reached the <!-- stop_search --> line.
It's also nice that I can then specify multiple sections within the
page if I need to.
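
For anyone finding this in the archives, the flag approach described
above can be sketched roughly as follows. This is only an illustration,
not the indexer's actual code: the function name extractSearchableLines
and the in-memory test page are made up, and it assumes each marker
comment sits on a line of its own.

```php
<?php
// Hypothetical helper: the name and the marker handling are assumptions
// for illustration, not the real indexer code.
function extractSearchableLines($handle, $startMarker, $stopMarker)
{
    $inContent = false;   // flag: are we inside a marked section?
    $kept = array();

    while (($line = fgets($handle)) !== false) {
        if (strpos($line, $startMarker) !== false) {
            $inContent = true;    // the marker line itself is not stored
            continue;
        }
        if (strpos($line, $stopMarker) !== false) {
            $inContent = false;
            continue;
        }
        if ($inContent) {
            $kept[] = rtrim($line, "\r\n");
        }
    }
    return $kept;
}

// An in-memory stream stands in for a page file here.
$page = "<select><option>Calgary</option></select>\n"
      . "<!-- start_search -->\n"
      . "<p>Real content</p>\n"
      . "<!-- stop_search -->\n"
      . "<p>Footer</p>\n"
      . "<!-- start_search -->\n"
      . "<p>More content</p>\n"
      . "<!-- stop_search -->\n";
$handle = fopen('php://memory', 'r+');
fwrite($handle, $page);
rewind($handle);
$lines = extractSearchableLines($handle, '<!-- start_search -->', '<!-- stop_search -->');
fclose($handle);
print_r($lines);
```

Because the flag is simply switched on and off as markers go by, any
number of marked sections per page works without extra code.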
On 26-Mar-09, at 6:22 PM, Ashley Sheridan wrote:
What about storing all of the page content in the database to start
with, then searching with a mysql statement is a breeze!
On Thu, 2009-03-26 at 16:29 -0600, George Langley wrote:
----- Original Message -----
From: Stuart <stuttle@xxxxxxxxx>
You can't have any extra info in a closing HTML tag. This problem is
usually handled using comments. Something like the following...
<div id="divContent">
<!-- content begin -->
sofihsod hiosdh sdh gus us u sg
<!-- content end -->
</div>
You then just start when you see the begin comment and stop when you
hit the end comment.
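
When the whole page is already in one string, one way to pull out
everything between such comment pairs is a lazy regular expression.
The sketch below is only an assumption about how that could look; the
function name extractBetweenComments and its default markers are
invented for the example.

```php
<?php
// Hypothetical helper; the name and default markers are assumptions.
function extractBetweenComments($html, $begin = '<!-- content begin -->', $end = '<!-- content end -->')
{
    // preg_quote() escapes the markers so they match literally. The 's'
    // modifier lets '.' span newlines, and '.*?' is lazy so each match
    // stops at the nearest end marker.
    $pattern = '/' . preg_quote($begin, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    preg_match_all($pattern, $html, $matches);
    return array_map('trim', $matches[1]);
}

$html = "<div id=\"divContent\">\n"
      . "<!-- content begin -->\n"
      . "sofihsod hiosdh sdh gus us u sg\n"
      . "<!-- content end -->\n"
      . "</div>";
$sections = extractBetweenComments($html);
print_r($sections);
```

Note that a begin comment with no matching end comment is skipped
entirely here, so a forgotten end marker drops that section rather
than indexing the rest of the page.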
---------
Hmm, they are stripping out the tags before looking at the words,
so didn't work quite as I originally thought.
The solution seems to be to explode the string based on the
entire comment before doing the word-by-word storing. I wrote up
the following test code that seems to work and handles any string
or error I could think of. Am wondering if this is good or is
there better/more efficient code?
<?php
function contentString($pString, $pStart, $pStop){
    echo "$pString<br />";
    $finalArray = array();
    $finalString = "";
    $exploded1 = explode($pStart, $pString); // makes array $exploded1
    // ignore first item (0) in array
    for ($i=1; $i<count($exploded1); $i++)
    {
        $exploded2 = explode($pStop, $exploded1[$i]);
        // array of just the wanted sections
        array_push($finalArray, $exploded2[0]);
    }
    foreach ($finalArray as $value3)
    {
        // " " ensures separation between substrings
        $finalString .= $value3 . " ";
    }
    // trim any extra white space from beginning/end
    $finalString = trim($finalString);
    echo $finalString;
    echo "<br /><br />";
}
// TEST
// (outputs are shown as a browser renders them, with runs of spaces
// collapsed)
$startTerm = "START";
$stopTerm = "STOP";
// test typical string
$theString = "one two START three four STOP five six START seven eight STOP nine ten";
// outputs "three four seven eight"
contentString($theString, $startTerm, $stopTerm);
// test string with immediate START
$theString = "START one two STOP three four START five six STOP seven eight START nine ten";
// outputs "one two five six nine ten"
contentString($theString, $startTerm, $stopTerm);
// test string with "error" (2 STARTS)
$theString = "START one two START three four STOP five six START seven eight STOP nine ten";
// outputs "one two three four seven eight"
contentString($theString, $startTerm, $stopTerm);
// test string with no space between separators and real content
$theString = "STARTone twoSTOP three four STARTfive sixSTOP seven eight STARTnine ten";
// outputs "one two five six nine ten"
contentString($theString, $startTerm, $stopTerm);
?>
Any thoughts/suggestions? Thanks!
George
Ash
www.ashleysheridan.co.uk
George Langley
Multimedia Developer, Audio/Video Editor, Musician, Arranger, Composer
http://www.georgelangley.ca
--
PHP General Mailing List (http://www.php.net/)