Re: XML Parsing, starting out.

sinai@xxxxxxxxx wrote:

Hello, this is my first message here, so I'm excited to see what results I get :)

I'm a novice with PHP parsing. So far I've managed quite well with xml_parse_into_struct and then echoing the fixed variable names, but now I'm dealing with an XML document that is, first, quite large and, second, changing gradually over time as data is added or changed (though not in large amounts).

http://www.camelotherald.com/xml/spells-si.xml

It is a spell listing for an online MMORPG. It has a "spell_list" element that wraps all the other tags and a "spell_line" for each spell line. Within each spell line are "spell" elements, each holding the detailed info for one spell in that "spell_line".

I could parse this by echoing the fixed variables, but given its size and the changes made over time that is just too much work. This is where I got stuck. I've read some guides and tried some example code that should have helped me through this.

What I hoped to get is an example using the XML I provided, so I could work from it to complete the project.

The example could look something like this:

A header with the selected "spell_line" name, e.g. "Calefaction", and below it the spell info in some dummy table format. Then the next "spell_line", which is "Path of Earth", again with its spell info in some dummy table setup.

Thanks for reading, and I hope someone can show me the light of a real XML parse.

Hi,

Some months ago I was looking for some code for "loading" dynamic XML data structures into my PHP scripts. I didn't find anything simple and flexible enough for my needs.

I found the PEAR XML_Beautifier tokenizer handy, and then wrote a simple recursive function that extracts tag element data from the tokenized arrays, with the array structure I needed for my project (included below).
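
As a rough illustration of the recursive idea only (this is not the function further down; the token layout is a hand-built approximation of the tokenizer output, and find_first_tag is a made-up name), a depth-first search over nested token arrays could look something like this:

```php
<?php
// Hypothetical token layout, modeled on the tokenizer output shown
// later in this message: tag tokens carry 'tagname', 'attribs' and
// 'children'; text tokens only carry 'data'.
$tokens = array(
    array(
        'tagname'  => 'spell',
        'attribs'  => array('name' => 'Minor Shield of Magma', 'id' => '22'),
        'children' => array(
            array('data' => "\n"),
            array('tagname' => 'level', 'attribs' => array(),
                  'children' => array(array('data' => '1'))),
            array('tagname' => 'damage_type', 'attribs' => array(),
                  'children' => array(array('data' => 'Matter'))),
        ),
    ),
);

// find_first_tag() is an illustrative name, not part of PEAR.
// Returns the $elem entry of the first token whose tagname is $tag,
// searching children depth-first, or false when nothing matches.
function find_first_tag($tag, $elem, $tokens) {
    foreach ($tokens as $token) {
        if (!is_array($token)) {
            continue;
        }
        if (isset($token['tagname']) && $token['tagname'] === $tag) {
            return isset($token[$elem]) ? $token[$elem] : false;
        }
        if (isset($token['children'])) {
            $found = find_first_tag($tag, $elem, $token['children']);
            if ($found !== false) {
                return $found;
            }
        }
    }
    return false;
}

$level = find_first_tag('level', 'children', $tokens);
echo $level[0]['data'], "\n";   // text content of the first <level>
$attr = find_first_tag('spell', 'attribs', $tokens);
echo $attr['name'], "\n";       // name attribute of the first <spell>
```

Passing the state through return values instead of globals keeps the sketch short; the real function also needs offset and count handling, which this leaves out.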

Have a look at the code below. Some lines process part of the contents of your XML sample file, so you have something to get started with if you find my approach handy.

I've also attached partial output of the data extracted by this code.

To run the code you need PEAR and XML_Beautifier plus its dependencies, if any (I don't remember right now whether XML_Parser is in the PEAR base install or not).

Have fun, :-)

Gonzalo Monzón

<?php
ini_set('include_path',"./pear;../");

require_once 'PEAR.php';
require_once 'XML/Beautifier/Tokenizer.php';

$xmlsrc = "http://www.camelotherald.com/xml/spells-si.xml";

$data = XML_tokenize($xmlsrc,True);
$spell_list = _gxml_bds_extract_tag('spell_list','children',$data);
// Extract spell_lines attribute data:
$spell_lines_attr = _gxml_bds_extract_tag('spell_line','attribs',$spell_list,1,-1);
// Extract spell_lines children data:
$spell_lines_data = _gxml_bds_extract_tag('spell_line','children',$spell_list,1,-1);
// Extract spell data and attributes from spell_lines:
foreach($spell_lines_data as $k => $val) {
   $spells_attr[$k] = _gxml_bds_extract_tag('spell','attribs',$val,1,-1);
   $spells_data[$k] = _gxml_bds_extract_tag('spell','children',$val,1,-1);
   // Extract data from all spells in $k spell_line:
   foreach($spells_data[$k] as $k2 => $val2) {
      $tmp = _gxml_bds_extract_tag('level','children',$val2);
      $spells_data_val[$k][$k2]['level'] = $tmp[0]['data'];
      $tmp = _gxml_bds_extract_tag('damage_type','children',$val2);
      $spells_data_val[$k][$k2]['damage_type'] = $tmp[0]['data'];
   }
}

/*****
(A)
******
*/
echo $spell_lines_attr[0]['name']."<br>";

foreach($spells_attr[0] as $k => $val) {
   echo $k."<br>";
   _print_bDS($val);
   _print_bDS($spells_data_val[0][$k]);
   echo "<br>";
}

/*****
(B)
******
*/

_print_bDS($spells_data[0][0]);


/*
******
Output: (A)
******

Calefaction
0
name => Minor Shield of Magma
desc => Creates a field that damages anyone who attacks the target in melee.
id => 22
target => Realm
level => 1
damage_type => Matter

1
name => Shield of Magma
desc => Creates a field that damages anyone who attacks the target in melee.
id => 23
target => Realm
level => 5
damage_type => Matter

2
name => Greater Shield of Magma
desc => Creates a field that damages anyone who attacks the target in melee.
id => 24
target => Realm
level => 9
damage_type => Matter

...

******
Output: (B)
******

0
type => 1
data =>
depth => 3



1
type => 2
tagname => level

attribs

   contains => 1
   depth => 3

children

0
   type => 1
   data => 1
   depth => 4







2
type => 1
data =>
depth => 3



3
type => 2
tagname => range

attribs

   contains => 1
   depth => 3

children

0
   type => 1
   data => 1000
   depth => 4







4
type => 1
data =>
depth => 3



5
type => 2
tagname => damage

attribs

   contains => 1
   depth => 3

children

0
   type => 1
   data => 0.7
   depth => 4







6
type => 1
data =>
depth => 3



7
type => 2
tagname => damage_type

attribs

   contains => 1
   depth => 3

children

0
   type => 1
   data => Matter
   depth => 4


...

*/



function XML_tokenize($data,$file=False) {
   $p = new XML_Beautifier_Tokenizer("iso-8859-1");
   return $p->tokenize($data,$file);
}


// Recursive function to extract an element once its tag is found
// in the tokenized XML data.
// If an offset is supplied, search for the Nth occurrence of the element.
// By default it returns only one tag struct; pass a count of -1
// to get all occurrences of that tag in the whole data.

function _gxml_bds_extract_tag($tag,$elem,$data,$offset=1,$count=1,$fromself=False) {
   global $_extract_offset_cnt;
   global $_extract_occurrence_cnt;
   global $_extract_data;

   // If not a recursive call: reset the counters and the result buffer
   if (!$fromself) {
       $_extract_offset_cnt = 0;
       $_extract_occurrence_cnt = 0;
       $_extract_data = array();
   }

   // Search for tokenized tags in this level.
   // foreach replaces the earlier do/while(next($data)) loop, because
   // next() returns a falsy value on empty arrays or strings and would
   // end the iteration early.
   foreach ($data as $key => $val) {
       if (is_array($val)) {
           // Value is an array: recursive call to itself
           $found = _gxml_bds_extract_tag($tag,$elem,$val,$offset,$count,True);
           // If we have any data,
           if (is_array($found))
               // If the needed count is reached: return.
               // Check with equality, as a <= comparison would fail for -1
               if (($count == $_extract_occurrence_cnt) || ($count == 1))
                   return $found;
       } elseif (($key === "tagname") and ($val == $tag)) {
           $_extract_offset_cnt += 1;
           if ($count == 1) {
               // We only want one occurrence of the tag: return it,
               // but only once the initial offset has been reached;
               // otherwise keep going...
               if ($offset == $_extract_offset_cnt) {
                   return $data[$elem];
               }
           } else {
               // Want more than one tag (count > 1, or -1 for all).
               // Check the initial offset before incrementing the occurrence counter
               if ($offset <= $_extract_offset_cnt) {
                   $_extract_occurrence_cnt += 1;
                   // Have we reached the desired count?
                   // While occurrences do not exceed the count (or count is -1),
                   // add the element; otherwise return all the data.
                   if (($count == -1) || ($_extract_occurrence_cnt <= $count)) {
                       $_extract_data[] = $data[$elem];
                   } else {
                       // We have reached the count:
                       return $_extract_data;
                   }
               }
           }
       }
       // Can we stop iterating over the tokens?
       // Case of one occurrence: if found, we have already returned.
       // Case of several occurrences: keep looping until we have the
       // data; if the count is reached, we have already returned.
   }
   if ($count == 1) {
       // Not found in this level, ok.
       return False;
   } else {
       // We may have got some occurrences without reaching the count.
       // In that case, return all the data in the global, if any
       if (sizeof($_extract_data) > 0)
           return $_extract_data;
       else
           return False;
   }
}

/**
* Debug helper function: prints a data structure recursively as HTML
*/
function _print_bDS($data) {
   foreach($data as $k => $v) {
       if (is_array($v)) {
           echo "<br><b>".htmlentities($k)."</b><ul>";
           _print_bDS($v);
           echo "</ul><br>";
       } else {
           echo htmlentities($k)." => ".htmlentities($v)."<br>";
       }
   }
}

?>
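
For comparison, if you are on PHP 5 the header-plus-table layout you asked for could also be sketched with the built-in SimpleXML extension instead of the tokenizer. This is only a sketch under assumptions: the inline sample is a hand-made excerpt whose layout is inferred from the output above (the real file may differ), and render_spell_lines is a made-up name.

```php
<?php
// Sample data: a small hand-made excerpt; the element and attribute
// layout is an assumption inferred from the output shown above.
$xml = <<<XML
<spell_list>
  <spell_line name="Calefaction">
    <spell name="Minor Shield of Magma" id="22" target="Realm">
      <level>1</level>
      <damage_type>Matter</damage_type>
    </spell>
    <spell name="Shield of Magma" id="23" target="Realm">
      <level>5</level>
      <damage_type>Matter</damage_type>
    </spell>
  </spell_line>
</spell_list>
XML;

// render_spell_lines() is an illustrative name. It returns one HTML
// heading per spell_line, followed by a table with one row per spell.
function render_spell_lines($xml) {
    $list = simplexml_load_string($xml);
    if ($list === false) {
        return '';   // not well-formed XML
    }
    $html = '';
    foreach ($list->spell_line as $line) {
        $html .= '<h2>' . htmlspecialchars((string) $line['name']) . "</h2>\n";
        $html .= "<table>\n";
        foreach ($line->spell as $spell) {
            $html .= '<tr><td>' . htmlspecialchars((string) $spell['name']) . '</td>'
                   . '<td>' . htmlspecialchars((string) $spell->level) . '</td>'
                   . '<td>' . htmlspecialchars((string) $spell->damage_type) . "</td></tr>\n";
        }
        $html .= "</table>\n";
    }
    return $html;
}

echo render_spell_lines($xml);
```

Because the loops follow whatever spell_line and spell elements are present, lines or spells added to the file later should show up without code changes, as long as the layout stays the same.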

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

