Man-wai Chang wrote:
>> On the other hand, I remember you talked about the type of that
>> column to be char(2). Have you specified what encoding it's using?
>> Moreover, I hope you're not using legacy encoding like Big5 or GB. Use
>> Unicode (UTF-8) if your database is a brand new one.
>>
>
> Unfortunately, I am still using Big5. you need a longer field to store
> utf-8 codes for the same big5 string right?
>

Yes. While Big5 represents every Chinese character with two bytes, UTF-8 uses at least three bytes per Chinese character (and, on rare occasions, four bytes, for very rare characters such as those found in ancient Chinese). This is because UTF-8 is designed to stay compatible with old data-processing functions that work on 8-bit (ASCII) text. In other words, a string of pure Chinese characters is 50% longer in UTF-8 than in Big5 (three bytes per character instead of two).

You could, of course, use UTF-16 as the base format for your strings. In this case, every character is represented by 2 bytes, be it a Western Latin character or an Eastern CJK character. OK, yes, rare characters take 4 bytes, but this is rare.

Anyway, you should look at the positive side of using Unicode instead of the dinosaur encoding, sorry, I mean Big5 :p Hard drives (and RAM) nowadays are getting really big, so string size should not be the first criterion when choosing an encoding. Unicode is maintained by an international consortium and supports most languages in the world. For instance, using Big5, you can't even represent the simplest of Western European characters, like the ones in these words: español or français!! But you can represent them using Unicode.

Actually, the ability to represent (Western) European characters might not interest you. But using Unicode, you can store both traditional and simplified Chinese! And this, I'm sure, does interest you. You can't do that in Big5, I'm 100% sure!

Still not convinced?
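The byte counts and the missing Western European letters mentioned above are easy to check for yourself. Here is a minimal sketch, written in Python rather than PHP purely for illustration; the helper names `can_encode` and `byte_lengths` are made up for this example and are not part of any library.

```python
# Illustrative sketch: compare how many bytes the same text needs in
# Big5, UTF-8 and UTF-16, and check which encodings can represent it.
# The helper names below are hypothetical, chosen for this example only.

ENCODINGS = ("big5", "utf-8", "utf-16-le")  # "-le" avoids counting a BOM


def can_encode(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def byte_lengths(text):
    """Map each encoding to the encoded size in bytes, or None if the
    text cannot be represented in that encoding."""
    return {
        enc: len(text.encode(enc)) if can_encode(text, enc) else None
        for enc in ENCODINGS
    }


if __name__ == "__main__":
    for sample in ("中文", "abc", "español"):
        print(sample, byte_lengths(sample))
```

For the two-character string 中文 this reports 4 bytes in Big5, 6 in UTF-8 and 4 in UTF-16, so a column sized in bytes for Big5 text needs half as much room again to hold the same text in UTF-8. For español the Big5 entry comes back as None, because ñ has no Big5 code.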
Well, Unicode even contains traditional Chinese characters that Big5 doesn't support. For example, a friend of mine has the character 驊 in his first name. This character isn't supported in Big5, and in the pre-Unicode period he had to type (馬華)! Very stupid! Another example: 氹 is quite a common character in southern China, but it can't be found in Big5.

So, think about using Unicode. It's 2007, so be a modern man!

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php