Extract printable text from web page using preg_match
I am trying to write a regex function to extract the readable
(visible, screen-rendered) portion of any web page. Specifically, I
only want the text between the <body> tags, excluding the contents of
any <script> or <style> elements and also excluding comments. Has
anyone here seen such a regex? Is it possible to do this in one
expression?
...Rene
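
For illustration, here is a minimal sketch of the extraction described above. It uses several preg_* passes rather than one expression, since a single pattern that covers every case gets hard to read. The function name extract_visible_text() is hypothetical and not from the original post. The sketch assumes reasonably well-formed HTML; regexes can misfire on unusual markup (for example a script that contains the string "-->"), so a real parser such as DOMDocument is likely to be more robust.

```php
<?php
// Hypothetical helper (an assumption, not from the original post):
// returns the visible text of an HTML string using a few regex passes.
function extract_visible_text(string $html): string
{
    // 1. Keep only what lies between <body ...> and </body>.
    //    If no body element is found, work on the whole input.
    if (preg_match('~<body\b[^>]*>(.*?)</body\s*>~is', $html, $m)) {
        $html = $m[1];
    }

    // 2. Drop comments, then <script> and <style> elements together
    //    with their contents. Patterns are applied in array order.
    $html = preg_replace(
        [
            '~<!--.*?-->~s',
            '~<(script|style)\b[^>]*>.*?</\1\s*>~is',
        ],
        ' ',
        $html
    );

    // 3. Replace every remaining tag with a space, then decode entities
    //    such as &amp; into the characters they stand for.
    $text = preg_replace('~<[^>]*>~', ' ', $html);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // 4. Collapse runs of whitespace into single spaces.
    return trim(preg_replace('~\s+~', ' ', $text));
}
```

Calling extract_visible_text() on the string returned by file_get_contents() for a page would give the body text as one whitespace-normalized line. Step 1 alone is the only part that maps directly onto preg_match; the exclusions are easier to express as replacements.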