Hi Marek,

Thank you for the solution.

--
Roger

Quoting Marek Kilimajer <lists@xxxxxxxxxxxxx>:

> That's because the character data is split on the borders of the
> entities, so for
>
> http://feeds.example.com/?rid=318045f7e13e0b66&cat=48cba686fe041718&f=1
>
> characterData() will be called 5 times:
>
> http://feeds.example.com/?rid=318045f7e13e0b66
> &
> cat=48cba686fe041718
> &
> f=1
>
> The solution is inlined below.
>
> Roger Thomas wrote:
> > I have a short script to parse my XML file. The parsing produces no
> > errors and all output looks good EXCEPT that URL links are truncated
> > if they contain '&' characters.
> >
> > My XML file looks like this:
> > --- start of XML ---
> > <?xml version="1.0" encoding="iso-8859-1"?>
> > <rss version="2.0">
> >   <channel>
> >     <title>Test News .Net - Newspapers on the Net</title>
> >     <copyright>Small News Network.com</copyright>
> >     <link>http://www.example.com/</link>
> >     <description>Continuously updating Example News.</description>
> >     <language>en-us</language>
> >     <pubDate>Tue, 29 Mar 2005 18:01:01 -0600</pubDate>
> >     <lastBuildDate>Tue, 29 Mar 2005 18:01:01 -0600</lastBuildDate>
> >     <ttl>30</ttl>
> >     <item>
> >       <title>Group buys SunGard for US$10.4bil</title>
> >       <link>http://feeds.example.com/?rid=318045f7e13e0b66&cat=48cba686fe041718&f=1</link>
> >       <description>NEW YORK: A group of seven private equity investment firms
> >       agreed yesterday to buy financial technology company SunGard Data Systems Inc
> >       in a deal worth US$10.4bil plus debt, making it the biggest
> >       lev...</description>
> >       <source url="http://biz.theexample.com/">The Paper</source>
> >     </item>
> >     <item>
> >       <title>Strong quake hits Indonesia coast</title>
> >       <link>http://feeds.example.com/news/world/quake.html</link>
> >       <description>a "widely destructive tsunami" and the quake was
> >       felt as far away as Malaysia.</description>
> >       <source url="http://biz.theexample.com.net/">The Paper</source>
> >     </item>
> >     <item>
> >       <title>Final News</title>
> >       <link>http://feeds.example.com/?id=abcdef&cat=somecat</link>
> >       <description>We are going to expect something new this weekend
> >       ...</description>
> >       <source url="http://biz.theexample.com/">The Paper</source>
> >     </item>
> >   </channel>
> > </rss>
> > --- end of XML ---
> >
> > For the sake of testing, my script only prints the URL link of each
> > news item above. I got these:
> > f=1
> > http://feeds.example.com/news/world/quake.html
> > cat=somecat
> >
> > The output for line 1 is truncated to 'f=1' and the output for line 3 is
> > truncated to 'cat=somecat', i.e. the script only took the last parameter
> > of the URL link. The output for line 2 is correct since it has NO
> > parameters.
> >
> > I am not sure what I have done wrong in my script. Is it because the RSS
> > spec says that you cannot have parameters in a URL? Please advise.
> >
> > -- start of script --
> > <?
> > $file = "test.xml";
> > $currentTag = "";
> >
> > function startElement($parser, $name, $attrs) {
> >     global $currentTag;
> >     $currentTag = $name;
> > }
> >
> > function endElement($parser, $name) {
> >     global $currentTag, $TITLE, $URL, $start;
> >
> >     switch ($currentTag) {
> >         case "ITEM":
> >             $start = 0;
> >         case "LINK":
> >             if ($start == 1)
> >                 #print "<A HREF = \"".$URL."\">$TITLE</A><BR>";
> >                 print "$URL"."<BR>";
> >             break;
> >     }
> >     $currentTag = "";
>
> // Also reset the other variables:
> $URL = '';
> $TITLE = '';
>
> > }
> >
> > function characterData($parser, $data) {
> >     global $currentTag, $TITLE, $URL, $start;
> >
> >     switch ($currentTag) {
> >         case "ITEM":
> >             $start = 1;
> >         case "TITLE":
> >             $TITLE = $data;
>
> // append instead:
> $TITLE .= $data;
>
> >             break;
> >         case "LINK":
> >             $URL = $data;
>
> // append instead:
> $URL .= $data;
> // Warning: entities are decoded at this point; you will receive &, not &amp;
>
> >             break;
> >     }
> > }
> >
> > $xml_parser = xml_parser_create();
> > xml_set_element_handler($xml_parser, "startElement", "endElement");
> > xml_set_character_data_handler($xml_parser, "characterData");
> >
> > if (!($fp = fopen($file, "r"))) {
> >     die("Cannot locate XML data file: $file");
> > }
> >
> > while ($data = fread($fp, 4096)) {
> >     if (!xml_parse($xml_parser, $data, feof($fp))) {
> >         die(sprintf("XML error: %s at line %d",
> >             xml_error_string(xml_get_error_code($xml_parser)),
> >             xml_get_current_line_number($xml_parser)));
> >     }
> > }
> >
> > xml_parser_free($xml_parser);
> >
> > ?>
> > -- end of script --
> >
> > TIA.
> > Roger
> >
> > ---------------------------------------------------
> > Sign Up for free Email at http://ureg.home.net.my/
> > ---------------------------------------------------

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
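
For anyone finding this thread later, here is a minimal self-contained sketch of the "append instead" fix described above. It is not the script from the thread: the function name extractItemLinks(), the closure-based handlers and the shortened sample feed are my own assumptions for illustration, and the sample writes the ampersands as &amp; so that the XML is well formed. The idea is the same, though: the character data handler may fire several times for one element, so it appends to a buffer, and the end-element handler reads and resets that buffer.

```php
<?php
// Hypothetical helper (not from the thread): returns the text of every
// <link> found inside an <item>, appending character data chunk by chunk.
function extractItemLinks(string $xml): array
{
    $links  = [];     // finished URLs, one per <item><link>
    $buffer = '';     // text collected for the element currently open
    $inItem = false;  // true between <item> and </item>

    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        function ($p, string $name, array $attrs) use (&$buffer, &$inItem) {
            // Tag names arrive upper-cased (case folding is on by default).
            if ($name === 'ITEM') {
                $inItem = true;
            }
            $buffer = ''; // start a fresh buffer for every element
        },
        function ($p, string $name) use (&$links, &$buffer, &$inItem) {
            if ($name === 'LINK' && $inItem) {
                $links[] = trim($buffer); // the complete, reassembled URL
            } elseif ($name === 'ITEM') {
                $inItem = false;
            }
            $buffer = '';
        }
    );

    xml_set_character_data_handler(
        $parser,
        function ($p, string $data) use (&$buffer) {
            // May be called several times per element (for example once
            // per decoded entity), so append rather than assign.
            $buffer .= $data;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $links;
}

// Shortened sample feed; note the &amp; entities in the item links.
$sample = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <link>http://www.example.com/</link>
    <item>
      <link>http://feeds.example.com/?rid=318045f7e13e0b66&amp;cat=48cba686fe041718&amp;f=1</link>
    </item>
    <item>
      <link>http://feeds.example.com/news/world/quake.html</link>
    </item>
  </channel>
</rss>
XML;

foreach (extractItemLinks($sample) as $link) {
    echo $link, "\n";
}
// Prints:
// http://feeds.example.com/?rid=318045f7e13e0b66&cat=48cba686fe041718&f=1
// http://feeds.example.com/news/world/quake.html
```

The returned URLs contain a plain &, because entities are already decoded when the handler sees them. If a link is printed into HTML, as the original script does, passing it through htmlspecialchars() first is the usual way to turn & back into &amp;.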