Processing a fetched external page

"Donovan Hutchinson" <djpreach@xxxxxxxxxxx> · Thu, 18 Mar 2004 18:28:55 -0000

Hi,

I'm working on a project that fetches the content of a URL and processes it. I've managed to extract the target URL's HTML, and am using str_replace to fix links, stylesheets, etc. However, I'm stumped when it comes to processing the text content.

Would anyone know how to isolate the displayed text (anything in the body: paragraph text, headings, etc.) and then manipulate that text word by word?

Any suggestions appreciated,

Don
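
For illustration, one possible approach is to walk the page with an HTML parser instead of string replacement, pass every tag through untouched, and rewrite only the text nodes. The sketch below is in Python rather than the language the project uses, and relies only on the standard library's html.parser module. The names WordRewriter, rewrite_words and SKIP_TAGS, the sample page, and the choice of treating a "word" as a run of letters, digits or underscores are all assumptions made for the example. It is a rough sketch, not a complete solution; real-world pages with malformed markup may need a more forgiving parser.

```python
import re
from html.parser import HTMLParser

# Elements whose text is not displayed as page content (assumed list).
SKIP_TAGS = {"script", "style", "title"}


class WordRewriter(HTMLParser):
    """Re-emits the HTML it is fed, applying `transform` to each displayed word."""

    def __init__(self, transform):
        # Keep entities such as &amp; as-is instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.transform = transform
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        # Emit the tag exactly as it appeared in the source.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            # Inside script/style/title: leave the text alone.
            self.out.append(data)
        else:
            # Displayed text: rewrite it one word at a time.
            self.out.append(
                re.sub(r"\w+", lambda m: self.transform(m.group(0)), data)
            )

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_words(html, transform):
    """Return `html` with `transform` applied to every displayed word."""
    parser = WordRewriter(transform)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Made-up sample page: upper-case every displayed word.
page = (
    "<html><head><title>Hi there</title>"
    "<style>p { color: red }</style></head>"
    '<body><h1>Hello world</h1>'
    '<p>Some <a href="page.html">linked text</a> &amp; more</p>'
    "</body></html>"
)
print(rewrite_words(page, str.upper))
```

The transform argument can be any function that takes one word and returns its replacement, so the per-word logic stays separate from the parsing. Because tags and attributes are copied through verbatim, links and stylesheet references already fixed in an earlier step are not disturbed.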