Re: Processing a fetched external page


Donovan Hutchinson wrote:

I'm working on a project that fetches the content of a URL and processes it. I've managed to extract the target URL's HTML, and am using str_replace() to fix links, stylesheets, etc. However, I'm stumped when it comes to processing the text content.

Would anyone know how to isolate the displayed text (anything in the body: paragraph text, headings, etc.) and then manipulate this text on a word-by-word basis?


Sure... use strip_tags()/fgetss() and then some regexps (preg_replace()) to clean up any remaining cruft. If you want, I can post my own brute-force "get all text from a webpage" script :)
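
A minimal sketch of that approach, not the script offered above. It assumes the HTML is already in a string and is UTF-8; the function names html_to_text() and map_words() are made up for illustration. It uses strip_tags() only, since fgetss() was deprecated in PHP 7.3 and removed in PHP 8.0. Because strip_tags() keeps the contents of script and style elements, those are removed with a regexp first.

```php
<?php
// Hypothetical helper: reduce an HTML document to its visible text.
function html_to_text(string $html): string
{
    // strip_tags() removes the tags but keeps what is inside <script> and
    // <style>, so drop those elements (tags and contents) first.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', ' ', $html) ?? $html;

    // Put a space in front of every tag so that adjacent blocks such as
    // "<h1>Title</h1><p>Text</p>" do not fuse into "TitleText".
    $html = str_replace('<', ' <', $html);

    $text = strip_tags($html);

    // Turn entities such as &amp; and &nbsp; back into characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace, including non-breaking spaces.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text) ?? $text;

    return trim($text);
}

// Hypothetical helper: apply $fn to each word, leaving the spacing alone.
function map_words(string $text, callable $fn): string
{
    $result = preg_replace_callback(
        '/\S+/u',
        function (array $m) use ($fn) {
            return $fn($m[0]);
        },
        $text
    );
    return $result ?? $text;
}

// Example input, invented for this sketch.
$html = '<html><head><style>p{color:red}</style></head>'
      . '<body><h1>Hello</h1><p>big&nbsp;<b>world</b> &amp; more</p></body></html>';

$text = html_to_text($html);
echo map_words($text, 'strtoupper'), "\n";
```

One trade-off to be aware of: adding a space before every tag also splits words that contain inline markup (for example "un<b>believable</b>"). For messy real-world pages, a DOM parser is likely to be more robust than regexps.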

   Bruno Ferreira

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

