On 3/28/2010 8:05 PM, Skip Evans wrote:
Hey all,
What's the best way to filter/convert characters that don't
translate properly from, say, news stories to HTML?
For example, I have a form where people paste the lead-in
paragraph of a news story so that their site can link back to
the original. And of course things like long dashes, double
quotes, single quotes, etc., always turn into wacky
unprintables when they are rendered, and the user has to
edit the text to replace them with standard characters.
Is there a way to filter this text through a function that will
convert them to web-friendly chars?
Thanks,
Skip
Here's how I handle the problem:
//region***** Translation table for the Windows-1252 chars that show up when a user pastes from Word.
// cleanWin1252Text() below strips whatever is left in the 0x7F-0xFF range after translation.
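$win1252ToPlainTextArray = array(
    chr(130) => ',',    // single low-9 quote
    chr(131) => '',     // florin sign (dropped)
    chr(132) => ',,',   // double low-9 quote
    chr(133) => '...',  // ellipsis
    chr(134) => '+',    // dagger
    chr(135) => '',     // double dagger (dropped)
    chr(139) => '<',    // single left angle quote
    chr(145) => '\'',   // left single quote
    chr(146) => '\'',   // right single quote
    chr(147) => '"',    // left double quote
    chr(148) => '"',    // right double quote
    chr(149) => '*',    // bullet
    chr(150) => '-',    // en dash
    chr(151) => '-',    // em dash
    chr(155) => '>',    // single right angle quote
    chr(160) => ' ',    // non-breaking space
);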
//endregion
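function cleanWin1252Text($str, $win1252ToPlainTextArray)
{
    // Translate the known Windows-1252 punctuation to plain ASCII first
    $str = strtr($str, $win1252ToPlainTextArray);
    $str = trim($str);
    // Everything else from 0x7F up has no entry in the table, so remove it
    $patterns = array('%[\x7F-\x81]%', '%[\x83]%', '%[\x87-\x8A]%',
        '%[\x8C-\x90]%', '%[\x98-\xff]%');
    return preg_replace($patterns, '', $str); // Strip
}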
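One caveat: the table above matches single bytes, so it assumes the pasted text reaches PHP as Windows-1252. If the form is submitted as UTF-8 instead (an assumption; check what your page actually sends), the same characters arrive as multi-byte sequences and none of the chr() keys will match. A rough sketch of the same idea for that case follows. The names $utf8ToPlainText and cleanUtf8Text() are made up for illustration and are not part of the code above.

```php
<?php
// Hypothetical UTF-8 counterpart of the Windows-1252 table: each key is the
// UTF-8 byte sequence of a typographic character, written with hex escapes.
$utf8ToPlainText = array(
    "\xE2\x80\x98" => "'",    // U+2018 left single quote
    "\xE2\x80\x99" => "'",    // U+2019 right single quote
    "\xE2\x80\x9A" => ',',    // U+201A single low-9 quote
    "\xE2\x80\x9C" => '"',    // U+201C left double quote
    "\xE2\x80\x9D" => '"',    // U+201D right double quote
    "\xE2\x80\x9E" => ',,',   // U+201E double low-9 quote
    "\xE2\x80\x93" => '-',    // U+2013 en dash
    "\xE2\x80\x94" => '-',    // U+2014 em dash
    "\xE2\x80\xA6" => '...',  // U+2026 ellipsis
    "\xE2\x80\xA2" => '*',    // U+2022 bullet
    "\xC2\xA0"     => ' ',    // U+00A0 non-breaking space
);

function cleanUtf8Text($str, $map)
{
    // Translate the known typographic characters to plain ASCII
    $str = strtr($str, $map);
    $str = trim($str);
    // Strip every byte that is not printable ASCII, tab, LF or CR
    return preg_replace('/[^\x09\x0A\x0D\x20-\x7E]/', '', $str);
}

// Curly quotes, an em dash, a curly apostrophe and an ellipsis, as UTF-8 bytes
$sample = "\xE2\x80\x9CHello\xE2\x80\x9D \xE2\x80\x94 it\xE2\x80\x99s here\xE2\x80\xA6";
echo cleanUtf8Text($sample, $utf8ToPlainText), "\n"; // prints: "Hello" - it's here...
```

Like the Windows-1252 version, this throws away every remaining non-ASCII byte, accented letters included, so it only suits text that is expected to be plain English.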
--
PHP General Mailing List (http://www.php.net/)