On Sat, Oct 10, 2009 at 11:40 PM, James Colannino <james@xxxxxxxxxxxxx> wrote:
>
> Hey everyone. I'd been troubled for a while by the fact that inserting
> cut-pasted special characters such as ä caused truncation when passed to
> MySQL, then discovered that it was because I was cutting and pasting unicode
> values into non-unicode Latin-1 strings.
>
> Since Latin-1 also has equivalent values, I was hoping that filtering my mixed
> unicode/non-unicode string through utf8_decode() would solve the problem, but
> instead, where the unicode character used to be, I now get a '?', followed by a
> few characters being taken out of the middle. I'm guessing that this is because
> utf8_decode() assumes the whole string is unicode and therefore removes a bunch
> of extra bytes from the string and corrupts it. At least, that's my guess. I
> could be very wrong (I have pretty much no experience with different character
> sets...)
>
> My question is, what's a good way to translate unicode characters in a
> non-unicode string to their Latin-1 equivalents? I need to be able to do this
> in order to sanitize a fairly common form of input.
>
> Thanks!
>
> James
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>

Have you tried iconv or mbstring? Is it an option to update the database to
use UTF-8?

Andrew
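
For the mixed string described in the question, one byte-level option is to convert only the byte pairs that look like UTF-8 encodings of Latin-1 characters and leave every other byte alone. What follows is a minimal sketch, not a definitive fix. It assumes PHP 7 or later, and it assumes the only stray UTF-8 in the input is two-byte sequences for U+0080 through U+00FF. mixed_to_latin1() is a hypothetical helper name, not a built-in. The heuristic is ambiguous by nature: genuine Latin-1 text such as "Ã¤" has the same bytes as UTF-8 "ä" and would be converted as well.

```php
<?php
// Hypothetical helper (not part of PHP): rewrites embedded two-byte UTF-8
// sequences for U+0080..U+00FF as single Latin-1 bytes and leaves all
// other bytes, including existing Latin-1 bytes, untouched.
function mixed_to_latin1(string $s): string
{
    // No /u modifier on purpose: the pattern has to match raw bytes.
    // Lead bytes 0xC2 and 0xC3 are the only ones whose two-byte
    // sequences decode into the Latin-1 range.
    return preg_replace_callback(
        '/[\xC2\xC3][\x80-\xBF]/',
        function (array $m): string {
            $lead  = ord($m[0][0]);
            $trail = ord($m[0][1]);
            // Standard two-byte UTF-8 decoding, reduced to the low 8 bits.
            return chr((($lead & 0x03) << 6) | ($trail & 0x3F));
        },
        $s
    );
}

// "é" arrives as UTF-8 (C3 A9) while "ï" is already Latin-1 (EF).
$input = "caf\xC3\xA9 na\xEFve";
echo bin2hex(mixed_to_latin1($input)), "\n"; // prints 636166e9206e61ef7665
```

If the whole string is known to be UTF-8, converting it in one pass with iconv or mbstring, or switching the table and connection to UTF-8, is the simpler route.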