This may actually be a MySQL question... Or not.

I'm scraping about 55,000 pages from a website into a MySQL database.

Some of these pages have "extended ASCII" values in their content or, in some cases, just plain junk byte values, as far as I can tell. For example, decimal 163 is sometimes used to represent the UK pound sign.

Unfortunately, when I insert/update the text into the database, the text is chopped off, as far as I can tell, at any extended ASCII value.

Now, I've set things up for UTF-8, expressly to avoid this kind of problem, I thought:

    var_dump(mysql_get_server_info($connection));
    var_dump(mysql_get_client_info());
    var_dump(mysql_client_encoding($connection));

    string(10) "5.0.26-log"
    string(6) "5.0.26"
    string(4) "utf8"

I'm open to any advice about the correct way to convert "extended ASCII" as typed in emails by tens of thousands of users on diverse systems, from diverse countries...

Note that "extended ASCII" is inconsistent from Microsoft to Apple to Unix to ..., as far as I understand it, so I really can't be sure which charset the original user was using.

Of course, for now, I'll just change any extended ASCII character into a space and move on with life... But that is not what I would like to end up with in the long run.

And why is MySQL not just taking these extended ASCII chars in the first place? Seems to me that UTF-8 encoding should accept them, no?

Disclaimer: I am so NOT hip to this encoding stuff...

--
PHP General Mailing List (http://www.php.net/)
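
A note on the likely cause, with a sketch of one possible workaround. A lone byte 163 (0xA3) is not a valid UTF-8 sequence, so a connection that declares utf8 is being handed bytes that are not UTF-8, which would explain text being cut at that point. The sketch below leaves text alone when it is already valid UTF-8 and otherwise converts it from a guessed single-byte charset. The helper name to_utf8 and the Windows-1252 fallback are assumptions made for illustration, not something from the original post, and the guess will be wrong for text typed in other charsets (Mac Roman, for example). It relies on PHP's mbstring extension.

```php
<?php
// Hypothetical helper (not from the original post): return $text as valid UTF-8.
// Assumption: anything that is not already valid UTF-8 is Windows-1252.
function to_utf8($text, $fallback = 'Windows-1252')
{
    // Valid UTF-8 (which includes plain 7-bit ASCII) is passed through unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise reinterpret the bytes in the guessed charset and re-encode.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// Byte 163 (0xA3) is the pound sign in Windows-1252 and ISO-8859-1.
$raw   = "Price: \xA3" . "5";
$clean = to_utf8($raw);

// Show the bytes before and after, so the result does not depend on the terminal charset.
echo bin2hex($raw), "\n";
echo bin2hex($clean), "\n";
```

The converted string is what would then be escaped and inserted over the utf8 connection. Checking for valid UTF-8 first avoids double-encoding pages that were already UTF-8.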