Re: Re: How do I extract link text from anchor tag as well as the URL from the "href" attribute

The code snippet below worked! Thank you so much for your time helping me
with this!
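
For anyone landing on this thread later: the remark quoted below about checking whether an <a> tag has children can be sketched roughly as follows. This is only a sketch under assumptions, not the code from the files Ralph sent. The sample markup, the example.com URLs, and the helper name extractLinks() are all made up for illustration, and the syntax assumes PHP 7 or later.

```php
<?php
// Hypothetical sample markup standing in for testHtml.html.
$sample = '<p><a href="http://example.com/a">Plain text link</a> '
        . '<a href="http://example.com/b"><img src="new.gif" alt="New"></a></p>';

/**
 * Returns one entry per <a> element: its href, its text, and any <img>
 * descendants. extractLinks() is an illustrative name, not a DOM API.
 */
function extractLinks(string $htmlSource): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($htmlSource);

    $result = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $images = [];
        // Called on an element, getElementsByTagName() searches only its descendants.
        foreach ($a->getElementsByTagName('img') as $img) {
            $images[] = $img; // a DOMElement for the image
        }
        $result[] = [
            'href'     => $a->getAttribute('href'),
            'linkText' => trim($a->nodeValue), // empty for image-only links
            'images'   => $images,
        ];
    }
    return $result;
}

foreach (extractLinks($sample) as $link) {
    echo $link['href'], ' | text: "', $link['linkText'],
         '" | images: ', count($link['images']), "\n";
}
```

Note that nodeValue is a property, not a method, which is why calling nodeValue() fails further down the thread.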

On Sun, Aug 16, 2009 at 11:26 AM, Ralph Deffke <ralph_deffke@xxxxxxxx> wrote:

> this worked here:
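> <?php
>
> // Load the HTML file into a DOM tree.
> $html = new DOMDocument();
> $html->loadHTMLFile("testHtml.html");
>
> // Collect every <a> element in the document.
> $links = $html->getElementsByTagName('a');
> echo "<pre>";
>
> foreach ($links as $item) {
>     // Print the href attribute, then the link text (the node's text content).
>     echo $item->getAttribute('href') . "\n";
>     echo "-------" . $item->nodeValue . "\n";
> }
>
> echo "</pre>";
>
> ?>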
>
> I'm sending you the two files directly in a minute. It turned out, as I
> thought earlier, that you have to check whether the <a> tags have children
> in order to extract image links.
>
> ralph_deffke@xxxxxxxx
>
>
> "chrysanhy" <phplists@xxxxxxxxxxxxxxxx> wrote in message
> news:88827b190908160943t2254137fve43771c7e4f8cc18@xxxxxxxxxxxxxxxxx
> > While waiting for suggestions for extracting the link text from the DOM, I
> > tried a brute-force approach using the URLs I had found with getAttribute(),
> > but found myself baffled by the results. I boiled my issue with this
> > approach down to the following snippet.
> >
> > $htmldata =<<<EOB
> > <a href="http://www.protools.com/users/user_story.cfm?story_id=1162&amp;lang=1">&quot;Creating
> >             Surround Mixes with Tim Weidner</a>&quot; <img height="11" src="new.gif" width="28">
> >             - <i>Magnification</i> engineer talks about mixing the album at the
> >             <i>ProTools</i> site, by Jim Batchco
> > <a href="http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html">&quot;Don't
> >             Go&quot; Video</a><a href="http://fi.soneraplaza.net/kaista/musiq/kaistatv/0,8883,201392,00.html"></a>
> >             <img height="11" src="new.gif" width="28"> - Presented by Beyond Music
> >             (<a href="http://www.apple.com/quicktime/download/">QuickTime</a>
> >
> >             Required)
> > EOB;
> > $url = 'http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html';
> > $posn = strpos($url, $htmldata);
> > echo "URL |$url| position is |$posn|";
> >
> > Running this gives me:
> >
> > URL |http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html| position is ||
> >
> > I've tried lots of functions, and even regular expressions, but I cannot get
> > the code to find the URL in the HTML. While I still hope for a DOM solution
> > to getting this link text, WHY can't the code find the URL in the HTML
> > snippet?
> >
> > On Sun, Aug 16, 2009 at 9:29 AM, chrysanhy <phplists@xxxxxxxxxxxxxxxx> wrote:
> >
> > > I pasted the code exactly as you have it, and I got the following:
> > >
> > > *Fatal error*: Call to undefined method DOMElement::getContent()
> > >
> > > I got the same thing with nodeValue().
> > >
> > >
> > > On Sun, Aug 16, 2009 at 7:35 AM, Ralph Deffke <ralph_deffke@xxxxxxxx> wrote:
> > >
> > >> Did you try something like this?
> > >>
> > >> foreach ($links as $link) {
> > >>    $int_url_list[$i]["href"] = $link->getAttribute( 'href' );
> > >>    $int_url_list[$i++]["linkText"] = $link->getContent(  ); // nodeValue();
> > >> }
> > >> That should work.
> > >>
> > >> Send your code then, please.
> > >> ralph_deffke@yahoo,de
> > >>
> > >>
> > >> "chrysanhy" <phplists@xxxxxxxxxxxxxxxx> wrote in message
> > >> news:88827b190908160033n226b370bqe2ab70732811b27@xxxxxxxxxxxxxxxxx
> > >> > I have the following code to extract the URLs from the anchor tags of an
> > >> > HTML page:
> > >> >
> > >> > $htmlpage = new DOMDocument();
> > >> > $htmlpage->loadHtmlFile($location);
> > >> > $xpath = new DOMXPath($htmlpage);
> > >> > $links = $xpath->query( '//a' );
> > >> > foreach ($links as $link)
> > >> > { $int_url_list[$i++] = $link->getAttribute( 'href' ) . "\n"; }
> > >> >
> > >> > If I have a link <a href="http://X.com">YYYY</a>, how do I extract the
> > >> > corresponding YYYY, which is displayed to the user as the text of the
> > >> > link? (If it's an image tag, I would like a DOMElement for that.)
> > >> > Thanks
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> PHP General Mailing List (http://www.php.net/)
> > >> To unsubscribe, visit: http://www.php.net/unsub.php
> > >>
> > >>
> > >
> >
>
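
On the strpos() question quoted above, which the thread never answers directly: the signature is strpos($haystack, $needle), so the posted call searches the short URL for the whole HTML block. That returns false, and false prints as an empty string, hence the "||". A minimal sketch with the arguments swapped, using a shortened stand-in for the HTML (an assumption, not the full original data):

```php
<?php
// Shortened stand-in for the HTML in the quoted message.
$htmldata = <<<EOB
<a href="http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html">&quot;Don't
            Go&quot; Video</a>
EOB;
$url = 'http://www.beyondmusic.com/MediaPlayer/Yes/DontGo.html';

// Original call: haystack and needle are swapped, so nothing can match.
$wrong = strpos($url, $htmldata);

// Corrected call: search the HTML (haystack) for the URL (needle).
$posn = strpos($htmldata, $url);

// Compare with !== false, because a match at offset 0 is also falsy.
if ($posn !== false) {
    echo "URL |$url| position is |$posn|\n";
} else {
    echo "URL |$url| not found\n";
}
```

One more thing to watch with this brute-force approach: getAttribute() returns the decoded value, so a URL containing & shows up as &amp; in the raw HTML (as in the protools link) and will not match unless one side is converted first, for example with htmlspecialchars() on the URL.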
