Strip Tags and Content

Hi guys, I'm trying to retrieve an HTML page from a URL, which I have already done
with the following script:

$document = implode('', file('http://www.mysite.net/'));

Then I need to strip the HTML tags out of it, which I did with the following
script:

$search = array ('@<script[^>]*?>.*?</script>@si', // Strip out javascript
'@<[\/\!]*?[^<>]*?>@si', // Strip out HTML tags
'@([\r\n])[\s]+@', // Strip out white space
'@&(quot|#34);@i', // Replace HTML entities
'@&(amp|#38);@i',
'@&(lt|#60);@i',
'@&(gt|#62);@i',
'@&(nbsp|#160);@i',
'@&(iexcl|#161);@i',
'@&(cent|#162);@i',
'@&(pound|#163);@i',
'@&(copy|#169);@i',
'@&#(\d+);@e'); // evaluate as php

$replace = array ('',
'',
'\1',
'"',
'&',
'<',
'>',
' ',
chr(161),
chr(162),
chr(163),
chr(169),
'chr(\1)');

$text = preg_replace($search, $replace, $document);
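The last pattern relies on the `e` modifier, which was deprecated in PHP 5.5 and removed in PHP 7.0. A minimal sketch of the same step with preg_replace_callback() follows; the function name decode_numeric_entities is hypothetical, and the sketch assumes UTF-8 output and PHP 5.3 or later (for closures):

```php
<?php
// Hypothetical helper: decode numeric entities such as &#169; without
// the "e" modifier. Named entities (&amp;, &lt;, ...) are left untouched.
function decode_numeric_entities($text)
{
    return preg_replace_callback(
        '@&#\d+;@',
        function ($match) {
            // $match[0] is the whole entity, e.g. "&#65;".
            // html_entity_decode() turns it into the matching character.
            return html_entity_decode($match[0], ENT_QUOTES, 'UTF-8');
        },
        $text
    );
}
```

Unlike chr(), which only covers byte values 0 to 255, this should also handle code points above 255, at the cost of producing multi-byte UTF-8 characters.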



My problem is that I still get jumbled text. I'd like to put a paragraph
break after every element: after the title there should be a paragraph break,
after the first block of text another, and so on. Also, the attributes of
the HTML tags are still showing, so I'd like to remove those from the
results as well. Thanks for any ideas.

Rodrigo
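
One possible way to get the paragraph breaks, sketched under assumptions: turn the closing tags of block-level elements into blank lines before stripping the remaining markup. The function name html_to_paragraphs and the list of block-level tags are hypothetical choices, not part of the script above, and the sketch assumes the page is UTF-8:

```php
<?php
// Hypothetical helper: convert HTML to plain text, leaving a blank line
// after each block-level element.
function html_to_paragraphs($html)
{
    // Drop <script> and <style> elements together with their content.
    $html = preg_replace('@<(script|style)\b[^>]*>.*?</\1\s*>@si', '', $html);

    // Collapse existing whitespace so that only the markup decides
    // where the line breaks go.
    $html = preg_replace('@\s+@', ' ', $html);

    // End of a block-level element (assumed list): insert a blank line.
    $html = preg_replace(
        '@</(title|h[1-6]|p|div|li|tr|table|blockquote)\s*>@i',
        "\n\n",
        $html
    );
    // <br> becomes a single line break.
    $html = preg_replace('@<br\s*/?>@i', "\n", $html);

    // Remove the remaining tags, attributes included.
    $text = strip_tags($html);

    // Decode entities such as &amp; and &#169;.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // Trim spaces around line breaks and cap runs of blank lines at one.
    $text = preg_replace('@[ \t]*\n[ \t]*@', "\n", $text);
    $text = preg_replace('@\n{3,}@', "\n\n", $text);

    return trim($text);
}
```

It could be called as $text = html_to_paragraphs($document);. Because strip_tags() removes each tag as a whole, the attributes should disappear along with it, although malformed markup may still need extra handling.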