The code snippet below worked! Thank you so much for your time helping me with this!

On Sun, Aug 16, 2009 at 11:26 AM, Ralph Deffke <ralph_deffke@xxxxxxxx> wrote:

> this worked here:
> <?php
>
> $html = new DOMDocument();
> $html->loadHtmlFile("testHtml.html");
> $links = $html->getElementsByTagName('a');
> echo "<pre>";
>
> foreach ($links as $item) {
>     echo $item->getAttribute( 'href' ). "\n";
>     echo "-------" . $item->nodeValue . "\n";
> }
>
> echo "</pre>";
>
> ?>
>
> I'm sending u the 2 files directly in a minute. it came out, as I thought
> earlier, that u have to check if the <a> tags have got children to extract
> image links.
>
> ralph_deffke@xxxxxxxx
>
>
> "chrysanhy" <phplists@xxxxxxxxxxxxxxxx> wrote in message
> news:88827b190908160943t2254137fve43771c7e4f8cc18@xxxxxxxxxxxxxxxxx
> > While waiting for suggestions for extracting the link text from the DOM, I
> > tried a brute force approach using the URLs I had found with getAttribute(),
> > but found myself baffled by my results. I boiled down my issue with this
> > approach to the following snippet.
> >
> > $htmldata =<<<EOB
> > <a href="http://www.protools.com/users/user_story.cfm?story_id=1162&lang=1">"Creating
> > Surround Mixes with Tim Weidner</a>" <img height="11"
> > src="new.gif" width="28">
> > - <i>Magnification</i> engineer talks about mixing the album at the
> > <i>ProTools</i> site, by Jim Batchco
> > <a href="http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html">"Don't
> > Go" Video</a><a href="http://fi.soneraplaza.net/kaista/musiq/kaistatv/0,8883,201392,00.html"></a>
> > <img height="11" src="new.gif" width="28"> - Presented by Beyond Music
> > (<a href="http://www.apple.com/quicktime/download/">QuickTime</a>
> > Required)
> > EOB;
> > $url = 'http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html';
> > $posn = strpos($url, $htmldata);
> > echo "URL |$url| position is |$posn|";
> >
> > Running this gives me:
> >
> > URL |http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html| position is ||
> >
> > I've tried lots of functions, and even regular expressions, but I cannot get
> > the code to find the URL in the HTML. While I still hope for a DOM solution
> > to getting this link text, WHY can't the code find the URL in the HTML
> > snippet?
> >
> > On Sun, Aug 16, 2009 at 9:29 AM, chrysanhy <phplists@xxxxxxxxxxxxxxxx> wrote:
> >
> > > I pasted the code exactly as you have it, and I got the following:
> > >
> > > *Fatal error*: Call to undefined method DOMElement::getContent()
> > >
> > > I got the same thing with nodeValue().
> > >
> > >
> > > On Sun, Aug 16, 2009 at 7:35 AM, Ralph Deffke <ralph_deffke@xxxxxxxx> wrote:
> > >
> > >> did u try it something like this
> > >>
> > >> foreach ($links as $link) {
> > >>     $int_url_list[$i]["href"] = $link->getAttribute( 'href' );
> > >>     $int_url_list[$i++]["linkText"] = $link->getContent( ); // nodeValue();
> > >> }
> > >> that should work
> > >>
> > >> send ur code then please
> > >> ralph_deffke@yahoo,de
> > >>
> > >>
> > >> "chrysanhy" <phplists@xxxxxxxxxxxxxxxx> wrote in message
> > >> news:88827b190908160033n226b370bqe2ab70732811b27@xxxxxxxxxxxxxxxxx
> > >> > I have the following code to extract the URLs from the anchor tags of an
> > >> > HTML page:
> > >> >
> > >> > $html = new DOMDocument();
> > >> > $htmlpage->loadHtmlFile($location);
> > >> > $xpath = new DOMXPath($htmlpage);
> > >> > $links = $xpath->query( '//a' );
> > >> > foreach ($links as $link)
> > >> > { $int_url_list[$i++] = $link->getAttribute( 'href' ) . "\n"; }
> > >> >
> > >> > If I have a link <a href="http://X.com">YYYY</a>, how do I extract the
> > >> > corresponding YYYY which is displayed to the user as the text of the link
> > >> > (if it's an image tag, I would like a DOMElement for that).
> > >> > Thanks
> > >> >
> > >>
> > >> --
> > >> PHP General Mailing List (http://www.php.net/)
> > >> To unsubscribe, visit: http://www.php.net/unsub.php
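
For anyone finding this thread later, here is a minimal sketch that pulls the two loose ends together. It is not taken from the files Ralph sent. The sample markup is a cut-down, partly hypothetical stand-in for the snippet above (the example.com link is made up), and the array keys simply reuse the names from earlier in the thread. Two things seem to explain the errors reported: strpos() expects the haystack first and the needle second, and nodeValue is a property of DOMElement, not a method, so it is read without parentheses.

```php
<?php
// Sample markup: a cut-down stand-in for the thread's snippet.
// The second link (example.com) is hypothetical, added to show an image-only anchor.
$htmldata = <<<EOB
<a href="http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html">"Don't Go" Video</a>
<a href="http://example.com/new"><img src="new.gif" height="11" width="28"></a>
EOB;

$url = 'http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html';

// strpos(haystack, needle): search the HTML for the URL, not the other way round.
$posn = strpos($htmldata, $url);

// strpos() returns false on failure, and false prints as an empty string,
// so test with === before echoing the position.
if ($posn === false) {
    echo "URL |$url| not found\n";
} else {
    echo "URL |$url| position is |$posn|\n";
}

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($htmldata);

$int_url_list = array();
foreach ($doc->getElementsByTagName('a') as $link) {
    // Look for <img> children, as suggested earlier in the thread.
    $imgs = $link->getElementsByTagName('img');

    $int_url_list[] = array(
        'href'     => $link->getAttribute('href'),
        // nodeValue is a property: no parentheses.
        'linkText' => trim($link->nodeValue),
        // First <img> child as a DOMElement, or null for a text-only link.
        'image'    => $imgs->length > 0 ? $imgs->item(0) : null,
    );
}

foreach ($int_url_list as $entry) {
    echo $entry['href'] . "\n";
    if ($entry['image'] !== null) {
        echo "------- [image] " . $entry['image']->getAttribute('src') . "\n";
    } else {
        echo "------- " . $entry['linkText'] . "\n";
    }
}
```

With the arguments the other way round, strpos() looks for the whole HTML block inside the short URL string, never finds it, and returns false, which would explain the empty || in the output quoted above.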