Re: Unicode string

On Jun 26, 2011, at 2:53 PM, eric wrote:

> On Sun, 2011-06-26 at 16:37 +0100, Jonathan Wakely wrote:
>> On 26 June 2011 16:09, eric wrote:
>>> On Sun, 2011-06-26 at 16:52 +0200, Axel Freyn wrote:
>>>> Hi Eric,
>>>> 
>>>> On Sun, Jun 26, 2011 at 07:15:54AM -0700, eric wrote:
>>>>> Dear C/g++ advanced programmers:
>>>>> I copied and tested a simple piece of code for hardcoding a Unicode
>>>>> string from the book C++ Cookbook by D. Ryan Stephens, Christopher
>>>>> Diggins, Jonathan Turkanis, and Jeff Cogswell (Chapter 13,
>>>>> Internationalization, Section 1: Hardcoding a Unicode String,
>>>>> Example 13-1). It compiles and runs with my g++ 4.5.2, but I am not
>>>>> satisfied with the result.
>>>>> 
>>>>> --------------------
>>>>> //Example 13-1 Hardcoding a Unicode string
>>>>> #include <iostream>
>>>>> #include <fstream>
>>>>> #include <string>
>>>>> 
>>>>> using namespace std;
>>>>> 
>>>>> int main() {
>>>>> 
>>>>> // Create some strings with Unicode characters
>>>>> wstring ws1 = L"Infinity: \u221E";
>>>>> wstring ws2 = L"Euro: \u0128";
>>>>> 
>>>>> wchar_t w[] = L"Infinity: \u221E";
>>>>> 
>>>>> wofstream out("tmp\\unicode.txt");
>>>>> out << ws2 << endl;
>>>>> wcout << ws2 << endl;
>>>>> }
>>>> As far as I know, you should absolutely NOT use non-ASCII characters in
>>>> input/output operations without explicitly specifying the
>>>> encoding/localization to be used.
>>>> In your example, I would thus propose adding, after the "wofstream
>>>> out..." line, a line like
>>>> out.imbue(locale("de_DE.UTF-8"));
>>>> which defines the encoding.
>>>> The following works for me:
>>>> 
>>>> #include <iostream>
>>>> #include <fstream>
>>>> #include <string>
>>>> using namespace std;
>>>> int main() {
>>>> wstring ws2 = L"Euro:\x20ac";
>>>> wofstream out("unicode.txt");
>>>> out.imbue(locale("de_DE.UTF-8"));
>>>> out << ws2<< endl;
>>>> }
>>>> 
>>>> (Besides, the Euro symbol in Unicode is U+20AC, written \x20ac above;
>>>> \u0128 in your example is a different character.)
>>>> 
>>>> 
>>>> In addition, you SHOULD add error-checking to your code. If you add
>>>>  if(not out.good())
>>>>    cerr << "Error while writing " << endl;
>>>> AFTER the line which writes into the file, you'll get an error message ....
>>>> 
>>>> HTH,
>>>> 
>>>> Axel
>>>> 
>>> ---------------------------
>>> Dear Axel:
>>> Thanks for your reply.
>>> I copied your code and tried it on my system.  It compiles and runs, but
>>> responds with
>>> ---------------------------
>>> root@eric-laptop:/home/eric/cppcookbook# ./a.out
>>> terminate called after throwing an instance of 'std::runtime_error'
>>> what():  locale::facet::_S_create_c_locale name not valid
>>> Aborted
>>> --------
>>> Do you know what else can be improved?
>> 
>> You probably don't have the de_DE.UTF-8 locale installed.  Try running
>> 'locale -a' to see which locales you have installed and pick a utf8
>> one, e.g. en_US.utf8
>> 
>>> looking to see your(or any advancer's) suggestion again, thanks a lot
>>> in advance, Eric
>> 
>> That word still doesn't mean anything.
> ------------------------------------------------------------------
> Dear g++ programmers:
> 
>  I already checked my locales and modified my code to use en_US.utf8;
> however, the same kind of error appears.  What could be causing it?
> I look forward to any suggestions from experienced users, and thanks a
> lot in advance.
> Eric
> -----------------------------------------------------------------
> root@eric-laptop:/home/eric/cppcookbook# ./a.out
> terminate called after throwing an instance of 'std::runtime_error'
>  what():  locale::facet::_S_create_c_locale name not valid
> Aborted
> root@eric-laptop:/home/eric/cppcookbook# cat exam13-1-2.cpp
> #include <iostream>
> #include <fstream>
> #include <string>
> using namespace std;
> int main() {
> wstring ws2 = L"Euro:\x20ac";
> wofstream out("unicode.txt");
> out.imbue(locale("en_US.utf8"));
> out << ws2<< endl;
> }
> 
> 
> root@eric-laptop:/home/eric/cppcookbook# locale -a
> C
> en_AG
> en_AU.utf8
> en_BW.utf8
> en_CA.utf8
> en_DK.utf8
> en_GB.utf8
> en_HK.utf8
> en_IE.utf8
> en_IN
> en_NG
> en_NZ.utf8
> en_PH.utf8
> en_SG.utf8
> en_US.utf8
> en_ZA.utf8
> en_ZW.utf8
> POSIX
> ------------------------
> 


Eric, may I ask what OS you are using?  After I read your postings, I tried the same code on my Mac (OS X 10.6), and encountered exactly the same problem.  Indeed, I do have en_US.UTF8 installed (on my system, all the suffixes are capitalized), and I have
out.imbue(locale("en_US.UTF8"));
My runtime throws the same exception as the one you are getting.  I did a little bit of looking on the Web.  This problem seems to have come up before, e.g., in this bug report:

https://trac.macports.org/ticket/15653

However, that report is three years old!  I have not seen any recent explanations of a suitable workaround.
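
In the meantime, one defensive pattern is to catch the exception and fall back to another locale instead of letting the program abort. The following is only a minimal sketch, not a fix for the underlying problem: first_available_locale is a hypothetical helper name (it does not come from the book or from libstdc++), and which locale names are accepted depends entirely on the platform and on how the C++ runtime was built.

```cpp
#include <cstddef>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): return the first locale in
// `names` that the C++ runtime can construct. If every name is rejected,
// return the classic "C" locale. When `chosen` is non-null, it receives
// the name that was actually used.
std::locale first_available_locale(const std::vector<std::string>& names,
                                   std::string* chosen = 0) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        try {
            // std::locale's constructor throws std::runtime_error when
            // the runtime does not accept the name.
            std::locale loc(names[i].c_str());
            if (chosen) *chosen = names[i];
            return loc;
        } catch (const std::runtime_error&) {
            // This name is not usable here; try the next candidate.
        }
    }
    if (chosen) *chosen = "C";
    return std::locale::classic();
}
```

To use it, one would fill a vector with candidate names such as "en_US.UTF-8" and "en_US.utf8" and pass the result to out.imbue(). If the helper reports "C" as the chosen name, no UTF-8 locale was available, and writing a character such as U+20AC to the wofstream can be expected to fail, so the check on out.good() suggested earlier in the thread is still needed.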

Amittai Aviram



