Re: caching parsed XML files as DOM objects in memory

I'm writing a template library based on XML. But it's not very
efficient to create a new DomDocument, load the XML template, process
it and show it on every page hit. XML parsing is not very fast, and
because I'm parsing XHTML with entities, all the DTDs are parsed too.
I thought about something similar to Java: there I can have a servlet
which lives as long as the server lives. It can load and parse the XML
only the first time and hand the DOM objects to other servlets.
I need something similar with PHP. Can it be done?


I think you might want to avoid trying to do it the Java way in PHP.

PHP is shared-nothing by architectural design, not accident, so that
you can scale up by throwing as much cheap/stock hardware at it as you
can afford instead of being forced to buy a single bigger hardware box
in the center for the shared data.

It would probably make a lot more sense to store whatever you use to
uniquely identify your XML source and the results in a database or
filesystem, and then compare time-stamps in some simple business logic
to decide to re-parse or serve from cache.
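
The time-stamp check described above could be sketched roughly as
follows, using the filesystem as the cache. This is only a minimal
sketch under assumptions: the function name cached_result(), the
$build callback and the sha1-of-path cache key are all hypothetical,
and the cached "result" is assumed to be a plain string (for example
the final page markup), not a DOM object.

```php
<?php
// Hypothetical helper: returns the cached result for $sourcePath and calls
// $build again only when the source file is newer than the cache file.
function cached_result(string $sourcePath, string $cacheDir, callable $build): string
{
    // Assumed choice of identifier: a hash of the source path.
    $cachePath = $cacheDir . '/' . sha1($sourcePath) . '.cache';

    $sourceTime = filemtime($sourcePath);
    if ($sourceTime === false) {
        throw new RuntimeException("Cannot stat source: $sourcePath");
    }

    // Serve from cache when it exists and is at least as new as the source.
    if (is_file($cachePath) && filemtime($cachePath) >= $sourceTime) {
        $cached = file_get_contents($cachePath);
        if ($cached !== false) {
            return $cached;
        }
    }

    // Otherwise do the expensive work once and store the result.
    $result = $build($sourcePath);

    // Write to a temporary file, then rename, so that concurrent readers
    // never see a partially written cache file.
    $tmp = tempnam($cacheDir, 'tmp');
    file_put_contents($tmp, $result);
    rename($tmp, $cachePath);

    return $result;
}
```

Here $build would wrap the expensive part (loading the template into a
DomDocument and processing it) and return the finished output, so a
cache hit skips the XML parse entirely. Caching the final output rather
than the DOM tree is an assumption of this sketch; it only helps for
the parts of a page that do not change between requests.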

Yes, that does just foist off the shared data to the database or
file-system -- but those systems were specifically designed to handle
this task, and have done so for a long time now with a lot of heavily
tested and optimized code.  PHP and even Java can't really match that
level of testing/optimization yet, simply due to relative ages.

If db and filesystem are "too slow" or you already have too many
machines running this code-base, you could write your own PHP "XML
cache server" that takes an XML id and either gets it from the db/file
cache, or parses the true original, and set up your own "server" for
this express purpose and really make it scream on speed...  That's
quite a bit of work, though, and for the simplicity of the code
involved, you may be better off writing it as a C application... Or
out-sourcing that bit of code to be written with specific timing
targets for the employee to meet/beat to get their just due $$$.


Thanks for the many ideas. You are probably right that I'm trying to think the "Java way". But my main bottleneck is the XML parsing, so I was trying to avoid it somehow. It is also slower because my XML is not "normal" XML but an XHTML file, so I need resolveExternals=true to parse files with XHTML entities (&nbsp;, &lt;, etc.). So I cannot cache my final objects in files or a database, because that involves some sort of serialization and later (when accessing the cache) unserialization, which is the slow parse part again.

That was the reason I thought about caching in memory.

I'm sure I can set up some "XML cache server", but again, how will I exchange data with it? I cannot move whole object trees between servers (the parsed XML objects can't be serialized). The last resort is to write a C extension, but why use PHP then? I could write it all in C, along with my own HTTP server :) I think there should be some way to have all objects (including PHP internal ones) stored somewhere in memory and "living" as long as the web server lives. It would solve many types of problems.

Petr

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

