Ron Rademaker wrote:
Nathan Rixham wrote:
Ron Rademaker wrote:
Hi,
I'm trying to load an external RSS feed into a DOMDocument. The feed
says it's UTF-8, but DOMDocument rightly disagrees. This causes a
warning: Input is not proper UTF-8, indicate encoding ! Bytes: 0xA3
0x36 0x35 0x20
Is there any way I can get DOMDocument to skip that part of the feed?
I'm not looking for the LIBXML_NOERROR option, which results in the
feed not loading at all. The feed has an incorrectly encoded pound
sign in it that causes the problem. I'd much rather have the feed
without that pound sign (or maybe with some weird character in its
place) than not have the feed at all.
Thanks,
Ron
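
One way to get what is asked for above (the feed minus the offending
bytes) is to drop every byte sequence that is not valid UTF-8 before
handing the string to DOMDocument. The following is only a sketch: it
assumes the mbstring extension is available, the helper name
strip_invalid_utf8 and the sample feed are made up for illustration,
and the lone 0xA3 byte from the warning is used as the test case.

```php
<?php
// Hypothetical helper: removes byte sequences that are not valid UTF-8.
function strip_invalid_utf8(string $bytes): string
{
    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();
    // 'none' tells mbstring to drop invalid sequences instead of emitting '?'.
    mb_substitute_character('none');
    $clean = mb_convert_encoding($bytes, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);
    return $clean;
}

// Made-up feed containing the bytes from the warning (0xA3 0x36 0x35 0x20).
$feed = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
      . "<rss><channel><title>Only \xA3\x36\x35\x20</title></channel></rss>";

$clean = strip_invalid_utf8($feed);

$doc = new DOMDocument();
$loaded = $doc->loadXML($clean);
```

The cost is that the pound sign is lost rather than repaired, which is
the trade-off described above.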
it's probably in latin-1; try running it through utf8_encode first :)
That's not really an option when users can define their own RSS feeds.
How can I tell which ones actually do as they say and which ones don't?
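
A possible answer to that question is to validate the bytes first and
convert only when validation fails. The sketch below assumes the
mbstring extension, and it assumes that anything which is not valid
UTF-8 is really Latin-1 (ISO-8859-1), which is a guess rather than
something the feed tells you. The function name load_feed and the
sample feed are hypothetical. mb_convert_encoding is used instead of
utf8_encode because utf8_encode is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: returns a DOMDocument for the feed,
// or null if the XML still cannot be parsed.
function load_feed(string $xml): ?DOMDocument
{
    // Feeds that really are UTF-8 pass this check and are left untouched.
    if (!mb_check_encoding($xml, 'UTF-8')) {
        // Assumption: input that is not valid UTF-8 is treated as Latin-1.
        $xml = mb_convert_encoding($xml, 'UTF-8', 'ISO-8859-1');
    }

    $doc = new DOMDocument();
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}

// Made-up feed that claims UTF-8 but carries a Latin-1 pound sign (0xA3).
$feed = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
      . "<rss><channel><title>Only \xA365 </title></channel></rss>";

$doc = load_feed($feed);
$title = $doc->getElementsByTagName('title')->item(0)->textContent;
```

One caveat: a feed that mixes valid UTF-8 with stray Latin-1 bytes
fails the check as a whole, so its valid multi-byte characters would
be converted a second time and come out garbled.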
Shameless plug: a few months ago I made a script, rss_php, which
handles all of this for you. It's commercial; however, the classes
which cover the encoding side are freely available here:
http://www.phpclasses.org/browse/package/4393.html
Whether you use them or simply check the source, you'll find the
solution :) [I hope! Works for me and 17k other people]
Regards!
--
nathan ( nathan@xxxxxxxxxxx )
{
Senior Web Developer
php + java + flex + xmpp + xml + ecmascript
web development edinburgh | http://kraya.co.uk/
}
--
PHP General Mailing List (http://www.php.net/)