I'm looking for a way to strip the HTML tags out of some text content (sourced from a web page), leaving just the text, which I'll then run some basic analysis on. The catch is that I want to preserve the text in alt and title attributes.

I can't use any DOM functions, as I can't guarantee that the content will be valid XHTML, although it should be valid HTML. I'm happy to do this with string functions and regular expressions, but I was wondering whether something for this already exists?

The server I plan to put this on has no shell access (although it is a Linux server), so I won't be able to have Lynx or Elinks parse the content for me either :(

Thanks,
Ash
http://www.ashleysheridan.co.uk
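
To make the requirement concrete, here is a rough sketch of the string-and-regex approach described above. It is written in Python purely for illustration, since the post does not name a language, and the function name `strip_tags_keep_alt_title` and the helper names are hypothetical. It assumes no attribute value contains a literal `>`, and it is not a full HTML parser.

```python
import html
import re

# All names below are hypothetical, chosen for this illustration only.

# Drop <script>/<style> blocks and comments entirely (an assumption:
# their contents are not wanted in the analysed text).
_SKIP_RE = re.compile(
    r"<(script|style)\b.*?</\1\s*>|<!--.*?-->",
    re.IGNORECASE | re.DOTALL,
)

# A single tag. Assumes attribute values never contain a literal '>'.
_TAG_RE = re.compile(r"<[^>]*>")

# alt= or title= with a double-quoted, single-quoted, or unquoted value.
# The lookbehind avoids matching names such as data-title.
_ATTR_RE = re.compile(
    r"(?<![\w-])(?:alt|title)\s*=\s*"
    r"(?:\"([^\"]*)\"|'([^']*)'|([^\s\"'>]+))",
    re.IGNORECASE,
)


def _tag_to_text(match):
    """Replace one tag with the text of its alt/title attributes, if any."""
    values = []
    for attr in _ATTR_RE.finditer(match.group(0)):
        value = attr.group(1) or attr.group(2) or attr.group(3) or ""
        if value:
            values.append(value)
    # Pad with spaces so text on either side of the tag is not glued together.
    return " " + " ".join(values) + " "


def strip_tags_keep_alt_title(markup):
    """Return the text of markup, keeping alt and title attribute values."""
    markup = _SKIP_RE.sub(" ", markup)
    text = _TAG_RE.sub(_tag_to_text, markup)
    text = html.unescape(text)      # decode entities such as &amp;
    return " ".join(text.split())   # collapse runs of whitespace


sample = "<p>Hello <img src=\"a.png\" alt=\"A cat\" title='Tom &amp; Jerry'> world</p>"
print(strip_tags_keep_alt_title(sample))
```

One side effect of this sketch is that every tag becomes whitespace, so inline markup inside a word (for example `foo<b>bar</b>`) splits it in two. That is probably acceptable for basic text analysis, but worth knowing.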