Re: [RFC PATCH] Windows: Assume all file names to be UTF-8 encoded.

Peter Krefting <peter@xxxxxxxxxxxxxxxx> · Mon, 02 Mar 2009 14:57:52 +0100 (CET)

Hi!

Makes sense too. I think the whole API would have to be changed to use 
TCHAR*.

I'd rather just say wchar_t explicitly. I'm not particularly fond of macros 
that change under your feet just because you fail to define a symbol 
somewhere...

Then you need to do the right conversion at the right places. This will be 
quite tricky, painful work, but there is probably no way around that.

In the other project I worked on we ended up wrapping all file-related calls 
in our own porting interface, and then let each platform we compiled for 
implement their own methods for handling Unicode paths. For Windows it's 
trivial since all APIs are Unicode. For Unix-like OSes it's tricky as you 
have to take the locale settings into account, but fortunately the world is 
slowly moving towards UTF-8 locales, which eases the pain a bit.
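
A minimal sketch of what such a porting interface could look like, assuming 
UTF-8 char* paths are what the common code passes around. The names 
port_fopen and utf8_to_wide are hypothetical and are not taken from the 
project mentioned above. The Unix branch simply assumes a UTF-8 locale and 
skips the locale handling a real port would need.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to a newly
 * allocated UTF-16 string. Returns NULL on failure; caller frees. */
static wchar_t *utf8_to_wide(const char *s)
{
    /* First call asks for the required length, including the NUL. */
    int n = MultiByteToWideChar(CP_UTF8, 0, s, -1, NULL, 0);
    wchar_t *w;

    if (n <= 0)
        return NULL;
    w = malloc((size_t)n * sizeof(*w));
    if (w && !MultiByteToWideChar(CP_UTF8, 0, s, -1, w, n)) {
        free(w);
        w = NULL;
    }
    return w;
}

/* Windows: convert at the boundary and call the wide-character API. */
FILE *port_fopen(const char *utf8_path, const char *mode)
{
    wchar_t *wpath, *wmode;
    FILE *f;

    assert(utf8_path != NULL && mode != NULL);
    wpath = utf8_to_wide(utf8_path);
    wmode = utf8_to_wide(mode);
    f = (wpath && wmode) ? _wfopen(wpath, wmode) : NULL;
    free(wpath);
    free(wmode);
    return f;
}
#else
/* Unix-like: ASSUMPTION: file names on disk are UTF-8, so the bytes can be
 * passed through unchanged. A real port would check the locale here. */
FILE *port_fopen(const char *utf8_path, const char *mode)
{
    assert(utf8_path != NULL && mode != NULL);
    return fopen(utf8_path, mode);
}
#endif
```

Callers would use port_fopen() everywhere they used fopen() before, so the 
platform differences stay in one file.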

Note that not only conversions will be needed but you'll also need to 
adjust all routines handling filenames to use the proper Unicode version. 
(strchr -> _tcschr, open -> _topen, strcpy -> _tcscpy, strlen -> 
_tcslen, ...).

Not necessarily. If the code can be set up to use UTF-8 char* internally, 
not everything needs to be rewritten (I've done that too, only took a 
couple of years to move the codebase over to all-Unicode).
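
To illustrate why much of the code can stay as it is: in UTF-8, every byte 
of a multi-byte sequence has its high bit set, so byte-oriented routines 
such as strchr, strrchr and strlen still work when searching for ASCII 
delimiters like '/'. A small sketch, with hypothetical helper names 
(utf8_basename, utf8_codepoints) that are not from any existing codebase:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the last component of a UTF-8 encoded path.
 * Plain strrchr is safe: '/' (0x2F) never occurs inside a multi-byte
 * UTF-8 sequence. */
const char *utf8_basename(const char *path)
{
    const char *slash;

    assert(path != NULL);
    slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Count code points by skipping continuation bytes (10xxxxxx).
 * strlen() still gives the byte length, which is what buffer
 * handling needs; this is only for when characters matter. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

Only the places that actually talk to the operating system, or that care 
about characters rather than bytes, then need to be Unicode-aware.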

--
\\// Peter - http://www.softwolves.pp.se/