Re: Good XML Parser

OK, thanks so much for the help.  I went with DOM parsing to begin
with; I'll explore XPath and SimpleXML later.

Thanks,
Waynn

On Mon, May 12, 2008 at 5:23 AM, David Otton
<phpmail@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> 2008/5/12 Waynn Lue <waynnlue@xxxxxxxxx>:
>> So if I'm looking to parse certain attributes out of an XML tree, if I
>> use SAX, it seems that I would need to keep track of state internally.
>>  E.g., if I have a tree like
>>
>> <head>
>>  <a>
>>   <b></b>
>>  </a>
>>  <a>
>>    <b></b>
>>  </a>
>> </head>
>>
>> and say I'm interested in all that's between <b> underneath any <a>,
>> I'd need to have a state machine that looked for an <a> followed by a
>> <b>.  If I'm doing that, though, it seems like I should just start
>> using a DOM parser instead?
>
> Yeah, I think you've got it nailed, although your example is simple
> enough (you're only holding one state value - "am I a child of <a>?")
> that I'd probably still reflexively reach for the lightweight
> solution. I use SAX for lightweight hacks, one step up from regexes -
> I know the information I want is between <tag> and </tag>, and I don't
> care about the rest of the document. The more I need to navigate the
> document, the more likely I am to use DOM. I could build my own data
> structures on top of a SAX parser, but why bother reinventing the
> wheel? Of course, you have to factor document size into that - parsing
> a big XML document into a tree can be slow.
>
> You might also want to explore XPath
> (http://uk.php.net/manual/en/function.simplexml-element-xpath.php
> http://uk.php.net/manual/en/class.domxpath.php)... XPath is to XML as
> Regexes are to text files. There's a good chance you'll be able to
> roll all your parsing up into a couple of XPath queries.
>
> I probably should have added that simple parsers come in two flavours
> - Push Parsers and Pull Parsers. I tend to think (lazily) of Push and
> Pull as variations on SAX, but strictly speaking they are different.
>
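
For reference, the "one state value" approach discussed above might look
roughly like this. It is only a sketch, assuming PHP's bundled XML Parser
extension (xml_parser_create and friends) is enabled. The text inside <b>
and the variable names ($insideA, $insideB, $found) are made up for
illustration, since the tree in the thread has empty elements.

```php
<?php
// The sample tree from the thread, with illustrative text added inside <b>.
$xml = '<head><a><b>first</b></a><a><b>second</b></a></head>';

$insideA = false;  // the state value: "am I a child of <a>?"
$insideB = false;  // true while inside a <b> that sits under an <a>
$found   = [];     // collected text, one entry per matching <b>

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called for every opening tag.
    function ($parser, $name, $attrs) use (&$insideA, &$insideB, &$found) {
        if ($name === 'a') {
            $insideA = true;
        } elseif ($name === 'b' && $insideA) {
            $insideB = true;
            $found[] = '';  // start a new entry for this <b>
        }
    },
    // Called for every closing tag.
    function ($parser, $name) use (&$insideA, &$insideB) {
        if ($name === 'b') {
            $insideB = false;
        } elseif ($name === 'a') {
            $insideA = false;
        }
    }
);

xml_set_character_data_handler(
    $parser,
    // Text can arrive in several chunks, so append rather than assign.
    function ($parser, $data) use (&$insideB, &$found) {
        if ($insideB) {
            $found[count($found) - 1] .= $data;
        }
    }
);

xml_parse($parser, $xml, true);

print_r($found);
```

A boolean flag is enough here because <a> is never nested inside another
<a>. With deeper or recursive structures the flag would have to become a
counter or a stack, which is about the point where DOM starts to look
more attractive.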

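The XPath suggestion above could be tried on the same sample tree along
these lines. This is a rough sketch, assuming the SimpleXML and DOM
extensions are available. The text inside <b> is invented for
illustration, and //a/b matches only <b> elements that are direct
children of an <a>; use //a//b if "underneath" should include deeper
descendants.

```php
<?php
// The sample tree from the thread, with illustrative text added inside <b>.
$xml = '<head><a><b>first</b></a><a><b>second</b></a></head>';

// SimpleXML: xpath() returns an array of SimpleXMLElement objects.
$doc = simplexml_load_string($xml);
$texts = [];
foreach ($doc->xpath('//a/b') as $b) {
    $texts[] = (string) $b;  // cast to get the element's text
}

// The same query through DOM and DOMXPath.
$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$domTexts = [];
foreach ($xpath->query('//a/b') as $node) {
    $domTexts[] = $node->textContent;
}

print_r($texts);
print_r($domTexts);
```

Both variants load the whole document into memory first, so the earlier
caveat about document size still applies.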

