2008/5/12 Waynn Lue <waynnlue@xxxxxxxxx>:
> So if I'm looking to parse certain attributes out of an XML tree, if I
> use SAX, it seems that I would need to keep track of state internally.
> E.g., if I have a tree like
>
> <head>
>   <a>
>     <b></b>
>   </a>
>   <a>
>     <b></b>
>   </a>
> </head>
>
> and say I'm interested in all that's between <b> underneath any <a>,
> I'd need to have a state machine that looked for an <a> followed by a
> <b>. If I'm doing that, though, it seems like I should just start
> using a DOM parser instead?

Yeah, I think you've got it nailed, although your example is simple
enough (you're only holding one state value - "am I a child of <a>?")
that I'd probably still reflexively reach for the lightweight solution.

I use SAX for lightweight hacks, one step up from regexes - I know the
information I want is between <tag> and </tag>, and I don't care about
the rest of the document. The more I need to navigate the document, the
more likely I am to use DOM. I could build my own data structures on top
of a SAX parser, but why bother reinventing the wheel?

Of course, you have to factor document size into that - a DOM parser
reads the whole document into memory as a tree before you can query it,
which can be slow for a big XML document.

You might also want to explore XPath
(http://uk.php.net/manual/en/function.simplexml-element-xpath.php and
http://uk.php.net/manual/en/class.domxpath.php)... XPath is to XML as
regexes are to text files. There's a good chance you'll be able to roll
all your parsing up into a couple of XPath queries.

I probably should have added that simple parsers come in two flavours -
push parsers and pull parsers. I tend to think (lazily) of push and pull
as variations on SAX, but strictly speaking they are different: SAX is a
push API (the parser calls your handlers), whereas with a pull parser
such as PHP's XMLReader your code asks for the next node.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
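
To make the trade-off in the message above concrete, here is a minimal sketch of both approaches to "the text of every <b> directly under an <a>". It is written with Python's standard library purely for illustration rather than in PHP; the same shape should carry over to PHP's SAX-style xml_parser functions and to DOMXPath, but that is not shown here. The names BUnderAHandler, b_under_a_sax, b_under_a_tree and the sample text inside the elements are all made up for the example. The SAX handler is deliberately simple and does not try to cope with an <a><b> nested inside another <a><b>.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Sample document modelled on the one in the question; the text content
# and the stray top-level <b> are assumptions added for the demo.
XML = "<head><a><b>first</b></a><a><b>second</b></a><b>ignored</b></head>"


class BUnderAHandler(xml.sax.ContentHandler):
    """SAX approach: track state by hand while events stream past."""

    def __init__(self):
        super().__init__()
        self.stack = []     # names of the elements currently open
        self.buffer = None  # text chunks while inside an a/b, else None
        self.results = []

    def startElement(self, name, attrs):
        # "Am I a <b> whose parent is <a>?" is the only state we need.
        if name == "b" and self.stack and self.stack[-1] == "a":
            self.buffer = []
        self.stack.append(name)

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        self.stack.pop()
        closing_a_b = name == "b" and self.stack and self.stack[-1] == "a"
        if closing_a_b and self.buffer is not None:
            self.results.append("".join(self.buffer))
            self.buffer = None


def b_under_a_sax(xml_text):
    """Stream the document and return the text of each a/b element."""
    handler = BUnderAHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.results


def b_under_a_tree(xml_text):
    """Tree approach: load everything, then ask one XPath-style query."""
    root = ET.fromstring(xml_text)
    return ["".join(b.itertext()) for b in root.findall(".//a/b")]


if __name__ == "__main__":
    print(b_under_a_sax(XML))   # ['first', 'second']
    print(b_under_a_tree(XML))  # ['first', 'second']
```

The tree version is one line of query, but it holds the whole document in memory; the SAX version needs the hand-rolled stack, but it only ever keeps the current path and one text buffer.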