Re: Strip Tags and Content


 



Hi,

You can use fgetss() or strip_tags() to take the tags off and html_entity_decode() to transform the HTML entities.
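
As a rough sketch of that suggestion (not tested against your page): fgetss() was later deprecated and then removed in PHP 8, so this uses strip_tags() only. The sample markup in $html is made up for illustration. A single html_entity_decode() call also covers the numeric entities that the '@&#(\d+);@e' pattern handles, which matters because the /e modifier was removed in PHP 7.

```php
<?php
// Hypothetical sample input; in practice this would be the fetched page.
$html = '<p title="x">Caf&eacute; &amp; bar &#169; 2006</p><script>alert(1);</script>';

// strip_tags() only removes the tags themselves, so script bodies are dropped here first.
$noScript = preg_replace('@<script[^>]*>.*?</script>@si', '', $html);

// strip_tags() removes each tag together with its attributes.
$plain = strip_tags($noScript);

// Decode named and numeric entities in one call (UTF-8 output is an assumption).
$text = html_entity_decode($plain, ENT_QUOTES, 'UTF-8');

echo $text, "\n";
```

Because strip_tags() drops the whole tag, the attributes go with it, so no separate pattern for attributes should be needed.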

I don't understand what you mean by putting it into paragraphs. Are you talking about rewriting the HTML, or something else?
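
If you simply mean a blank line after the title and after each block of text, one possible approach is to insert the line breaks before the tags are stripped. This is only a sketch: the sample markup and the list of block-level tags in the pattern are assumptions, so adjust them to the pages you are fetching.

```php
<?php
// Hypothetical sample input standing in for the fetched page.
$html = '<html><head><title>My Title</title></head>'
      . '<body><h1 class="big">Heading</h1><p align="left">First block.</p>'
      . '<p>Second &amp; last block.</p></body></html>';

// Append a blank line after each closing block-level tag (the tag list is an
// assumption) so the breaks survive once the tags are stripped.
$marked = preg_replace('@</(title|h[1-6]|p|div|li|tr)\s*>@i', "$0\n\n", $html);

// Strip the tags (and their attributes), then decode the entities.
$text = html_entity_decode(strip_tags($marked), ENT_QUOTES, 'UTF-8');

// Collapse runs of three or more newlines to one blank line and trim the ends.
$text = trim(preg_replace("@\n{3,}@", "\n\n", $text));

echo $text, "\n";
```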

- Alex "Sunstorm"

On Tue, 28 Mar 2006 15:08:31 +0100, "Ministério Público" <arquivomortovirtual@xxxxxxxxx> wrote:

Hi guys, I'm trying to retrieve an HTML page from a URL, which I have already done
with the following script:

$document = implode('', file('http://www.mysite.net/'));

Then I need to strip the HTML tags from it, which I did with the following
script:

$search = array ('@<script[^>]*?>.*?</script>@si', // Strip out javascript
'@<[\/\!]*?[^<>]*?>@si', // Strip out HTML tags
'@([\r\n])[\s]+@', // Strip out white space
'@&(quot|#34);@i', // Replace HTML entities
'@&(amp|#38);@i',
'@&(lt|#60);@i',
'@&(gt|#62);@i',
'@&(nbsp|#160);@i',
'@&(iexcl|#161);@i',
'@&(cent|#162);@i',
'@&(pound|#163);@i',
'@&(copy|#169);@i',
'@&#(\d+);@e'); // evaluate as php

$replace = array ('',
'',
'\1',
'"',
'&',
'<',
'>',
' ',
chr(161),
chr(162),
chr(163),
chr(169),
'chr(\1)');

$text = preg_replace($search, $replace, $document);



My problem is that I still get jumbled text. I'd like a paragraph break
after every element: one after the title, another after the first block of
text, and so on. Also, the attributes of the HTML tags are still showing,
so I'd like to remove those from the results as well. Thanks for any ideas.

Rodrigo




--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

