Re: get content rss feed

On Wed, May 2, 2012 at 7:00 AM, Doeke Wartena <clankill3r@xxxxxxxxx> wrote:
> I try to get the content from the following rss feed
> http://www.adafruit.com/blog/feed/
>
> I want to store it in a database in order to use it for a school assignment.
> If i look in my browser to the feed then i see content and description,
> however if i try to get them with php then description is this:
>
> [description] => SimpleXMLElement Object
>        (
>        )
>
>
> and content is gone.
>
>
> ----------
> $db = dbConnect();
>
> $xml = getFileContents("http://www.adafruit.com/blog/feed/");
> $xmlTree = new SimpleXMLElement($xml);
>
> for($i = count($xmlTree->channel->item)-1; $i >= 0; $i--) {
> $item = $xmlTree->channel->item[$i];
>  echo "<pre>";
> print_r($item);
> echo "</pre>";
> }
>
> dbClose($db);
>
> ?>
> ----------
>
> this is 1 part of the print_r:
>
> SimpleXMLElement Object
> (
>    [title] => Birth of the ARM: Acorn Archimedes Promo from 1987
>    [link] => http://www.adafruit.com/blog/2012/04/28/birth-of-the-arm-acorn-archimedes-promo-from-1987/
>    [comments] =>
> http://www.adafruit.com/blog/2012/04/28/birth-of-the-arm-acorn-archimedes-promo-from-1987/#comments
>    [pubDate] => Sat, 28 Apr 2012 04:01:35 +0000
>    [category] => Array
>        (
>            [0] => SimpleXMLElement Object
>                (
>                )
>
>            [1] => SimpleXMLElement Object
>                (
>                )
>
>        )
>
>    [guid] => http://www.adafruit.com/blog/?p=30498
>    [description] => SimpleXMLElement Object
>        (
>        )
>
> )
>
> I guess content is gone cause it's like this:
>
> <content:encoded>
>
> And description is gone cause it's like this:
>
> <![CDATA[
>
> But how can i avoid this problem (i'm quite new)?
>
> bye


Hi, Doeke, welcome to PHP!

RSS feed processing can be its own special form of hell, as feed
providers often include a whole set of extra namespaces. Luckily, this
isn't much of a concern, because SimpleXML lets you work with those
namespaces as well.

First, notice the beginning of the sample RSS feed:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
        xmlns:content="http://purl.org/rss/1.0/modules/content/"
        xmlns:wfw="http://wellformedweb.org/CommentAPI/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:atom="http://www.w3.org/2005/Atom"
        xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
        xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
        >

You will need to tell your XML parser about those extra namespaces in
order for it to return useful info. Look at the manual for
[SimpleXMLElement::getDocNamespaces](http://us.php.net/manual/en/simplexmlelement.getdocnamespaces.php),
which will tell you which namespaces the document declares.
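
As a rough sketch of that first step, the snippet below lists the declared namespaces. The inline `$xml` string is a cut-down stand-in for the real feed (an assumption for illustration); in your script it would be whatever your `getFileContents()` call returns.

```php
<?php
// Cut-down stand-in for the real feed; only two of its namespaces are shown.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Sample feed</title>
  </channel>
</rss>
XML;

$feed = new SimpleXMLElement($xml);

// Returns an array of prefix => namespace URI for the root element's declarations.
$namespaces = $feed->getDocNamespaces();

foreach ($namespaces as $prefix => $uri) {
    echo $prefix, ' => ', $uri, "\n";
}
```

The URI listed for the "content" prefix is the one you pass along when you register the namespace.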

You'll notice that one of the namespaces above is "content", and if
you look in the RSS feed, you'll see a
<content:encoded>....</content:encoded> in each feed item. You need to
retrieve this by specifying the content namespace for the element
"encoded" in the item.

To do that, you'll need to register the namespace for XPath queries using
[SimpleXMLElement::registerXPathNamespace](http://us.php.net/manual/en/simplexmlelement.registerxpathnamespace.php).
Once you do that, you'll be able to retrieve the content:encoded
element via the xpath method.
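
Here is a minimal sketch of how that might look, not a definitive implementation. The inline `$xml` is a made-up one-item feed and `$rows` is just an illustrative name for whatever you end up inserting into your database. It also casts `description` to a string: print_r() shows an empty object for CDATA sections, but the text is still there when you cast the element.

```php
<?php
// Made-up one-item feed standing in for the real one.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Sample post</title>
      <description><![CDATA[Short summary]]></description>
      <content:encoded><![CDATA[<p>Full body</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$feed = new SimpleXMLElement($xml);
$rows = [];

foreach ($feed->channel->item as $item) {
    // Register the prefix on the element that the XPath query is run against.
    $item->registerXPathNamespace('content', 'http://purl.org/rss/1.0/modules/content/');

    // Relative query: content:encoded children of this item.
    $encoded = $item->xpath('content:encoded');

    $rows[] = [
        'title'       => (string) $item->title,
        // Casting to string pulls the text out of the CDATA section.
        'description' => (string) $item->description,
        'content'     => $encoded ? (string) $encoded[0] : '',
    ];
}

print_r($rows);
```

Another route that should give the same result is `$item->children('http://purl.org/rss/1.0/modules/content/')->encoded`, which skips XPath entirely.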

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



