Re: Extract printable text from web page using preg_match

On Tue, February 27, 2007 11:47 am, M5 wrote:
> I am trying to write a regex function to extract the readable
> (visible, screen-rendered) portion of any web page. Specifically, I
> only want the text between the <body> tags, excluding any <script> or
> <style> tags within the document, also excluding comments. Has anyone
> here seen such a regex? Is it possible to do in one expression?

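I think strip_tags() (http://php.net/striptags) may be your best bet.
Note that it removes the tags themselves (and HTML comments) but keeps
the text between them, so the contents of <script> and <style> elements
would have to be removed separately before calling it.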
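
For what it's worth, here is a rough sketch of how that could look. It is
not a single expression: it uses a few preg_* passes and then strip_tags().
The function name extract_visible_text() is made up for this example, and
the regexes assume reasonably well-formed markup (a real parser such as
DOMDocument would be more robust on messy pages).

```php
<?php
// Hypothetical helper for illustration only; not part of PHP itself.
function extract_visible_text(string $html): string
{
    // Keep only what sits between <body ...> and </body>, if both are present.
    if (preg_match('~<body\b[^>]*>(.*?)</body>~is', $html, $m)) {
        $html = $m[1];
    }

    // Drop HTML comments. preg_replace() returns null on error,
    // so fall back to the unmodified input in that case.
    $html = preg_replace('~<!--.*?-->~s', ' ', $html) ?? $html;

    // Drop <script> and <style> elements together with their contents.
    $html = preg_replace('~<(script|style)\b[^>]*>.*?</\1\s*>~is', ' ', $html) ?? $html;

    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('~\s+~', ' ', $text) ?? $text);
}

$page = '<html><head><title>T</title><style>p{}</style></head>'
      . '<body class="x"><!-- hidden --><script>var a = "<b>";</script>'
      . '<p>Hello &amp; welcome</p> <style>.a{}</style><p>World</p></body></html>';

echo extract_visible_text($page), "\n"; // prints: Hello & welcome World
```

One caveat: strip_tags() does not insert any whitespace where it removes a
tag, so adjacent block elements with no whitespace between them will end up
with their text run together.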

-- 
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?


