On Apr 26, 2006, at 5:45 AM, Kevin Davies wrote:
> Obviously I need to convert these on entry, or on output into RSS.
> Does anyone know of an easy way to do this, or is it a case of
> identifying each unusual character individually?
These high-ASCII characters have ord() values greater than 126. If
you're rendering to HTML, you can go through your string converting
each one into '&#ord_value;', where ord_value is the return from ord()
(so your result looks like "&#210;"). That fixes the primary problem
(things breaking) and should at least limit the damage on the
secondary problem (loss of information). In my experience, however,
this will clobber some characters pretty badly, because the byte value
is not always the right code point: byte 147 is a curly quote in
Windows-1252, but &#147; refers to a control character. Alternatively,
you can just zap them (into "*" or "~" or some other printable
character), which will work better for text rendering.
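For what it's worth, here is a minimal sketch of those two options.
The function names entity_encode_high_bytes() and zap_high_bytes() are
made up for this illustration, and the closure syntax assumes PHP 5.3
or later. Both work byte by byte, so they assume a single-byte
encoding on input; UTF-8 text would be mangled.

```php
<?php
// Hypothetical helper (illustration only): replace every byte with an
// ordinal value above 126 by a numeric character reference from ord().
function entity_encode_high_bytes( $text ) {
    return preg_replace_callback(
        '/[\x7F-\xFF]/',   // no /u modifier, so this matches single bytes
        function( $match ) {
            return '&#' . ord( $match[0] ) . ';';
        },
        $text
    );
}

// Hypothetical helper (illustration only): zap every such byte into a
// printable placeholder instead.
function zap_high_bytes( $text, $placeholder = '*' ) {
    return preg_replace( '/[\x7F-\xFF]/', $placeholder, $text );
}

$sample = 'caf' . chr( 233 );   // "cafe" with an e-acute in ISO-8859-1

echo entity_encode_high_bytes( $sample ), "\n"; // caf&#233;
echo zap_high_bytes( $sample ), "\n";           // caf*
```

The first form keeps the information but is only right when the byte
value matches the intended code point; the second is lossy but safe
for plain-text output.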
You can also mix the two, by identifying individually those
characters that you are concerned with preserving and zapping the
others, e.g.
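<?php
/**
 * Validate a string as being gremlin-free text. Characters with ordinal
 * value greater than 126 will be converted into the best equivalent.
 *
 * @param mixed $text Something which might be a string. Modified in place.
 *
 * @return array|bool True (valid), false (not valid), or an array of
 *                    unconverted exception ordinal values (valid but dirty).
 */
function validate_text( &$text ) {
    // Replacements are numeric character references, so the converted
    // text stays 7-bit and is not caught by the gremlin scan below.
    static $conversions = array(
        // Windows & Word
         133 => '&#8230;' // horizontal ellipsis
        ,145 => '&#8216;' // left single quote
        ,146 => '&#8217;' // right single quote
        ,147 => '&#8220;' // left double quote
        ,148 => '&#8221;' // right double quote
        ,149 => '&#8226;' // bullet
        ,150 => '&#8211;' // en dash
        ,151 => '&#8212;' // em dash
        // Mac
        ,165 => '&#8226;' // bullet
        ,208 => '&#8211;' // en dash
        ,209 => '&#8212;' // em dash
        ,210 => '&#8220;' // left double quote
        ,211 => '&#8221;' // right double quote
        ,212 => '&#8216;' // left single quote
        ,213 => '&#8217;' // right single quote
    );
    if( is_scalar( $text ) || is_null( $text ) ) {
        // Swap the known bytes for their references first.
        $corpus = str_replace(
             array_map( 'chr', array_keys( $conversions ) )
            ,$conversions
            ,(string) $text
        );
        // Zap whatever high bytes remain, recording offset => ordinal.
        $gremlins = array( );
        $length = strlen( $corpus );
        for( $ii = 0; $ii < $length; $ii++ ) {
            if( ( $ordv = ord( $corpus[ $ii ] ) ) > 126 ) {
                $gremlins[ $ii ] = $ordv;
                $corpus[ $ii ] = '*';
            }
        }
        $text = $corpus;
        if( count( $gremlins ) ) {
            return $gremlins;
        }
        return true;
    }
    return false;
}
?>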
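Since validate_text() can hand back true, false, or an array, the
caller has to use strict comparisons to tell the three cases apart (a
non-empty array is also "truthy"). A rough sketch of how a caller
might do that follows; describe_validation() is a hypothetical helper
for illustration and is not part of the function above.

```php
<?php
// Hypothetical helper (illustration only): turn the three possible
// return values of validate_text() into a log-friendly message.
function describe_validation( $result ) {
    if( $result === false ) {
        return 'not text';
    }
    if( $result === true ) {
        return 'clean';
    }
    // Otherwise: an array of byte offset => ordinal value, one entry
    // per byte that was zapped into '*'.
    $parts = array( );
    foreach( $result as $offset => $ordv ) {
        $parts[] = $ordv . '@' . $offset;
    }
    return 'zapped ' . count( $result ) . ' byte(s): ' . implode( ', ', $parts );
}

echo describe_validation( true ), "\n";              // clean
echo describe_validation( false ), "\n";             // not text
echo describe_validation( array( 3 => 233 ) ), "\n"; // zapped 1 byte(s): 233@3
```

In real use you would pass it the return value directly, e.g.
describe_validation( validate_text( $title ) ), keeping in mind that
$title itself is rewritten by reference.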
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php