HTML text extraction

leledumbo <leledumbo_cool@xxxxxxxxxxx> · Tue, 18 Aug 2009 01:37:41 -0700 (PDT)

Websites usually show a preview of each article by extracting its first few
characters. This is easy if the article is plain text, but what if it's
HTML? For instance, if I have the full text:

<p>
  bla bla bla
  <ul>
    <li>item 1</li>
    <li>item 2</li>
    <li>item 3</li>
  </ul>
</p>

and I take the first 40 characters, it would result in:

<p>
  bla bla bla
  <ul>
    <li>item

As you can see, the tags are left unclosed, which might break the markup of
whatever follows the preview on the page. I need a way to cut the text
without leaving tags open.
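
One possible direction, sketched in Python rather than PHP purely for
illustration: count only the visible text against the limit, keep a stack of
the tags that are currently open, and close them in reverse order once the
limit is reached. The names html_preview, PreviewBuilder and VOID_TAGS are
made up for this sketch, the list of void tags is incomplete, and comments
in the source are simply dropped, so treat it as a starting point and not a
finished solution.

```python
import html
from html.parser import HTMLParser

# Elements that never take a closing tag (assumed, partial list).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class PreviewBuilder(HTMLParser):
    """Collects markup until `limit` text characters have been emitted."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.limit = limit    # remaining budget of text characters (tags are free)
        self.out = []         # pieces of the preview, in order
        self.open_tags = []   # stack of elements that are currently open
        self.done = False     # set once the budget is used up

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close everything up to and including the matching element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        if len(data) >= self.limit:
            data = data[:self.limit]
            self.done = True
        self.limit -= len(data)
        # Entities were decoded by the parser, so escape the text again.
        self.out.append(html.escape(data, quote=False))


def html_preview(source, limit):
    """Return at most `limit` text characters of `source`, with all tags closed."""
    builder = PreviewBuilder(limit)
    builder.feed(source)
    builder.close()
    # Whatever is still open at the cut point gets closed innermost-first.
    closing = "".join("</%s>" % tag for tag in reversed(builder.open_tags))
    return "".join(builder.out) + closing


print(html_preview("<p>bla bla bla<ul><li>item 1</li><li>item 2</li></ul></p>", 16))
# prints: <p>bla bla bla<ul><li>item </li></ul></p>
```

Note that whitespace between tags counts as text here, so indented markup
like the sample above uses up the budget faster than it looks; collapsing
whitespace first would be a reasonable refinement.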

-- 
View this message in context: http://www.nabble.com/HTML-text-extraction-tp25020687p25020687.html
Sent from the PHP - General mailing list archive at Nabble.com.