Re: getting content excerpts from the database

Ashley Sheridan wrote:

> Here's the rub though. As the content is in HTML form, I can't just
> grab the first 100 characters and display them as that could leave an
> open tag without a closing one, potentially breaking the page. I
> could use strip_tags on the 100-character excerpt, but what if the
> excerpt itself broke a tag in half (e.g. <acronym title="something">
> could become <acron)?
> 
> The only solutions I can see are:
> 
> 
>       * retrieve the entire article, perform a strip_tags and then
>       take the excerpt
>       * use a regex inside of mysql to pull out only the text
> 

- parse the HTML and extract the text elements.

If the HTML is well-formed, this is relatively easy to do with XSL. If
it is not, you might need a more forgiving parser such as Beautiful Soup
or similar.
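
As a rough illustration of the "parse, then extract the text" idea, here
is a minimal sketch in Python. It uses the standard library's
html.parser rather than XSL or Beautiful Soup, purely so that it runs
without extra packages. The names TextExtractor and excerpt are made up
for this example, and the 100-character limit is taken from the
question above.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves never get here.
        self.parts.append(data)


def excerpt(html, limit=100):
    """Return at most `limit` characters of plain text taken from `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Collapse whitespace runs so markup indentation does not use up the limit.
    text = " ".join("".join(parser.parts).split())
    return text[:limit]


if __name__ == "__main__":
    article = '<p>The <acronym title="something">PHP</acronym> list</p>'
    print(excerpt(article, 100))
```

Because the cut is made on the extracted text and not on the markup, the
excerpt cannot end in half a tag. Note that this simple version also
keeps the contents of script and style elements, so those would need to
be skipped if the stored articles can contain them.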



-- 
Per Jessen, Zürich (16.1°C)


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

