Re: [Mingw-w64-public] toUpper()


 



In an earlier, separate posting I expressed my concern about std::string and locales on MS-Windows: what is the point of wstring, I asked, if in the end there is no reliable UTF-8/16/32 support for them? The best thing to do is to use Boost or some other third-party library if true UTF support is needed.
towupper is no different from toupper when it comes to letters like á or ñ.
If you know of a way to handle them, especially the letter LL and its lowercase counterpart when arranged in an SIAO (sort-in-alphabetical-order) list, please do let me know.

Thanks in advance.

-----Original Message----- From: Martin Sebor
Sent: Wednesday, July 1, 2015 11:17 AM
To: papa@xxxxxxxxxxx ; Riot ; mingw-w64-public@xxxxxxxxxxxxxxxxxxxxx
Cc: gcc-help Mailing List
Subject: Re: [Mingw-w64-public] toUpper()

On 07/01/2015 06:02 AM, papa@xxxxxxxxxxx wrote:
std::wstring source(L"Hello World");
std::wstring destination;
destination.resize(source.size());
std::transform (source.begin(), source.end(), destination.begin(),
(int(*)(int))std::toupper);

The above code is what did the trick; do not ask how, I am still
digesting it. However, any suggestions would be very much appreciated.

This solved problem (1) below but doesn't work correctly or
portably because of the second problem I described in my first
response. std::toupper(int) is defined for narrow characters in
the range [0, UCHAR_MAX] plus EOF. The function has undefined
behavior for characters outside that range (i.e., all wchar_t
greater than UCHAR_MAX).

I don't know what will happen on Windows(*) but on Linux, I can
see the program doesn't work correctly for the Latin Extended
Additional block of characters (the first one I noticed). For
instance, running the attached modified version of the program
in a UTF-8 locale such as en_US.utf8 to convert U+1EBD (LATIN
SMALL LETTER E WITH TILDE) to its uppercase form (U+1EBC)
prints:

    U+1EBD  U+1EBC  U+1EBD

when the expected output is:

    U+1EBD  U+1EBC  U+1EBC

If you want to use transform with wide characters, you need
to use towupper (declared in <wctype.h>).
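
As an illustration of the towupper suggestion, here is a minimal sketch. The
helper name toUpper is borrowed from the original question and is only a
placeholder; the lambda is one way to avoid naming an overloaded function as
the transform argument. The result for non-ASCII characters depends on the
current C locale (see setlocale), and only one-to-one case mappings are
handled.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy of a wide string.
// std::towupper consults the current C locale, so call std::setlocale
// first if characters outside the basic ASCII range must be converted.
std::wstring toUpper(std::wstring wstr)
{
    std::transform(wstr.begin(), wstr.end(), wstr.begin(),
                   [](wchar_t c) {
                       // towupper takes and returns wint_t; convert back.
                       return static_cast<wchar_t>(std::towupper(c));
                   });
    return wstr;
}
```

Calling toUpper(L"Hello World") in the default "C" locale should yield
L"HELLO WORLD".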

Martin

[*] I vaguely recall toupper and friends aborting on Windows
when passed an out-of-range argument but I'm not 100% sure.


-----Original Message----- From: Martin Sebor
Sent: Tuesday, June 30, 2015 10:01 PM
To: Riot ; mingw-w64-public@xxxxxxxxxxxxxxxxxxxxx
Cc: gcc-help Mailing List
Subject: Re: [Mingw-w64-public] toUpper()

On 06/30/2015 05:24 PM, Riot wrote:
     #include <algorithm>
     #include <string>

     std::string str = "Hello World";
     std::transform(str.begin(), str.end(), str.begin(), std::toupper);

Please note this code is subtly incorrect for two reasons.
There are two overloads of std::toupper:

1) int toupper(int) declared in <ctype.h> (and the equivalent
    std::toupper in <cctype>)
2) template <class charT> charT std::toupper(charT, const locale&)
    in <locale>

Without the right #include directive, the above may or may
not resolve to "the right" function (which depends on what
declarations the two headers bring into scope).

When it resolves to (2) it will fail to compile.

When it resolves to (1), it will do the wrong thing (have
undefined behavior) at runtime when char is a signed type
and the argument is negative (because (1) is only defined
for EOF and for values in the range [0, UCHAR_MAX]).
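
For the narrow-string case, a common way to sidestep both problems is sketched
below. The function name toUpperNarrow is made up for this example. The lambda
removes the overload ambiguity, and taking the argument as unsigned char keeps
the value passed to toupper within [0, UCHAR_MAX] even where char is signed.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy of a narrow string.
std::string toUpperNarrow(std::string str)
{
    std::transform(str.begin(), str.end(), str.begin(),
                   [](unsigned char c) {
                       // c is never negative here, so the call is defined.
                       return static_cast<char>(std::toupper(c));
                   });
    return str;
}
```

This works on single bytes only, so it is not suitable for multibyte
encodings such as UTF-8 beyond the ASCII range.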

But the question is about converting std::wstring to upper
case and the above uses a narrow string. For wstring, the
std::ctype<wchar_t>::toupper() function or its convenience
non-member template function can be used.

See also: http://www.cplusplus.com/reference/locale/toupper/

This is one possible way to do it. Another approach is along
these lines:

    std::locale loc (...);
    std::wstring wstr = L"...";
    const std::ctype<wchar_t> &ct =
        std::use_facet<std::ctype<wchar_t> >(loc);
    ct.toupper (&wstr[0], &wstr[0] + wstr.size());
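
Fleshed out into a complete function, the facet approach might look like the
sketch below. The name toUpperFacet and the default locale argument are
assumptions made for the example; which characters get converted depends on
the locale that is passed in.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-cases a wide string using the ctype facet
// of the given locale (defaults to a copy of the global locale).
std::wstring toUpperFacet(std::wstring wstr,
                          const std::locale& loc = std::locale())
{
    if (wstr.empty())
        return wstr;  // nothing to convert

    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(loc);

    // The range overload converts the characters in place.
    ct.toupper(&wstr[0], &wstr[0] + wstr.size());
    return wstr;
}
```

With the classic locale, toUpperFacet(L"Hello World", std::locale::classic())
should give L"HELLO WORLD".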

Martin


This may also help in future: http://lmgtfy.com/?q=c%2B%2B+toupper

-Riot

On 30 June 2015 at 23:58,  <papa@xxxxxxxxxxx> wrote:
I would like to write a function to capitalize letters, say...
std::wstring toUpper(const std::wstring wstr){
for ( auto it = wstr.begin(); it != wstr.end(); ++it){
         global_wapstr.append(std::towupper(&it));

}
}

This doesn’t work, but doesn’t the standard already have something like
std::wstring::toUpper(...)?

Thanks in advance


---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com


_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

