Re: Filter MS Word Garbage

On Tue, 12 Sep 2006 15:02:08 -0700
Kevin Murphy <php@xxxxxxxxxxxxxxxxxx> wrote:

> I have a web form where trusted people will be inputting information  
> that they usually copy/paste out of MS Word. While the people are  
> trusted... MS Word isn't. I keep getting garbage characters in there,  
> usually associated with Smart Quotes. If I take the content out of  
> the DB, throw it into BBEdit and use the convert to ASCII command  
> that solves the problem.

Iterate over each character (byte) and use ord() to get its numeric
value. If that value is less than 128, the character is ASCII; otherwise
it is not. Based on that, it would be very easy to write a function to
strip all non-ASCII characters.
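
A minimal sketch of such a function, assuming the input is handled as a
raw byte string. The name strip_non_ascii() is made up for illustration.
Note that it simply drops the smart quotes instead of converting them
to plain ones:

```php
<?php
// Hypothetical helper: keep only the bytes whose ord() value is below 128.
function strip_non_ascii($input)
{
    $out = '';
    $len = strlen($input);
    for ($i = 0; $i < $len; $i++) {
        $byte = $input[$i];
        // ord() returns the numeric value (0-255) of a single byte.
        if (ord($byte) < 128) {
            $out .= $byte;
        }
    }
    return $out;
}

// 0x93 and 0x94 are the Windows-1252 smart quotes that Word tends to produce.
echo strip_non_ascii("\x93Hello\x94 world"), "\n";
```

Multi-byte encodings such as UTF-8 use only bytes of 128 and above for
non-ASCII characters, so the same loop should remove those as well.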

However, you might consider giving people back exactly what they
submitted. That means using mysql_escape_string() or an equivalent when
storing the fragment, and then using htmlentities() to escape any special
HTML characters when displaying the field in HTML. That might preserve
formatting information embedded in the clipboard fragment (if that's
something you want).
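
The display half of that could look roughly like the sketch below. The
render_fragment() name and the $pasted variable are made up, and the
'UTF-8' argument is an assumption; pass whatever charset your form
actually submits in, otherwise htmlentities() may mangle or drop the
text:

```php
<?php
// Hypothetical helper: escape a stored fragment for display in an HTML page.
// Assumes the fragment is UTF-8; pass the charset your form really uses.
function render_fragment($fragment)
{
    // ENT_QUOTES also escapes single quotes, which is safer inside attributes.
    return htmlentities($fragment, ENT_QUOTES, 'UTF-8');
}

// A fragment with curly quotes (as UTF-8 bytes) and some markup.
$pasted = "\xE2\x80\x9CHello\xE2\x80\x9D <b>world</b>";
echo render_fragment($pasted), "\n";
```

The angle brackets come out as &lt; and &gt;, so pasted markup is shown
as text instead of being interpreted by the browser.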

Mike

-- 
Michael B Allen
PHP Active Directory SSO
http://www.ioplex.com/
