UTF-8 vs. current locale charset mess...

GTK+ 1.3 (and 2.0) uses UTF-8 internally, while the file-system-related
C runtime calls like stat(), open() and opendir() use the "current
codepage" (that is the Windows term; on Unix you want to use whatever
encoding/charset the user's locale uses).
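
To make the mismatch concrete, here is a small sketch that assumes the
locale charset is ISO-8859-1 (an assumption for illustration only; the
real charset depends on the user's setup). The helper name
latin1_to_utf8() is hypothetical and not part of GLib. It shows that a
name like "café.xcf" is a different byte sequence in the two encodings,
so passing the UTF-8 form to stat() would look for the wrong file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert an ISO-8859-1 string to UTF-8.
 * Every Latin-1 byte maps to the Unicode code point of the same value,
 * so bytes below 0x80 are copied and bytes >= 0x80 become two bytes.
 * Returns a malloc()ed string, or NULL on allocation failure. */
char *
latin1_to_utf8 (const char *latin1)
{
  const unsigned char *in;
  unsigned char *out;
  size_t j = 0;

  assert (latin1 != NULL);
  in = (const unsigned char *) latin1;

  /* Worst case: every input byte expands to two output bytes. */
  out = malloc (2 * strlen (latin1) + 1);
  if (out == NULL)
    return NULL;

  for (; *in != '\0'; in++)
    {
      if (*in < 0x80)
        out[j++] = *in;
      else
        {
          out[j++] = (unsigned char) (0xC0 | (*in >> 6));
          out[j++] = (unsigned char) (0x80 | (*in & 0x3F));
        }
    }
  out[j] = '\0';
  return (char *) out;
}
```

For example, the single Latin-1 byte 0xE9 ("é") comes out as the two
bytes 0xC3 0xA9, which is why the two forms cannot be used
interchangeably as pathnames.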

This causes some problems in GIMP. The code in fileops.c and
elsewhere doesn't take into consideration that strings obtained from
the user and strings that are passed to GTK+ are in UTF-8, while
pathnames used in stat()/open() etc. should be in the current
locale charset.

In GTK+ 1.3, the gtk_file_selection_get_filename() function currently
returns a string in the current locale charset, not UTF-8. However, if
it leads to cleaner code, this can be changed. GTK+ 1.3 is a developer
version, and is really used for "production" only on Windows, mainly
for GIMP. (Whatever other apps might use it on Windows probably don't
even try to be i18n-correct anyway.)

GLib 1.3 has the functions g_filename_from_utf8() and
g_filename_to_utf8() to convert back and forth. They are not carved
in stone yet, either.

But... what do you think? Is it OK at this late stage to add code to
GIMP to do conversions back and forth to UTF-8? It would obviously be
done in such a way that everything would work as before on GTK+ 1.2.x
and Unix. There would just be some extra g_strdup()ing and
g_free()ing. The code would get a bit more verbose, like this:

#if defined (GTK_CHECK_VERSION) && GTK_CHECK_VERSION (1,3,1)
#define FILENAME_TO_UTF8(fn) g_filename_to_utf8 (fn)
#define FILENAME_FROM_UTF8(fn) g_filename_from_utf8 (fn)
#else
#define FILENAME_TO_UTF8(fn) g_strdup (fn)
#define FILENAME_FROM_UTF8(fn) g_strdup (fn)
#endif

...

  else if (status != PDB_CANCEL)
    {
      temp_filename = FILENAME_TO_UTF8 (filename);
      g_message (_("Open failed.\n%s"), temp_filename);
      g_free (temp_filename);
    }

...

		temp_filename = FILENAME_FROM_UTF8 (mfilename);
		err = stat (temp_filename, &buf);
		g_free (temp_filename);

Etc. One would have to indicate precisely which string
variables/parameters are in UTF-8 and which are in the current locale
charset, and make sure these invariants hold. Or should I just ignore
this issue, and let it wait until GIMP 1.[34], which I assume will be
targeted to work with GTK+ 2.0?
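
One way to keep the verbosity down might be to hide the
convert/call/free sequence in a small helper, so each call site stays
a single line. The sketch below is only an illustration: stat_utf8()
and filename_from_utf8() are hypothetical names, and the conversion is
a plain copy, mirroring the g_strdup() fallback for GTK+ 1.2.x. With
GTK+ 1.3 the body of filename_from_utf8() is where
g_filename_from_utf8() would be called instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Stand-in for FILENAME_FROM_UTF8() in its GTK+ 1.2.x form: a plain
 * copy. Returns a malloc()ed string, or NULL on allocation failure. */
char *
filename_from_utf8 (const char *utf8_name)
{
  size_t n = strlen (utf8_name) + 1;
  char *copy = malloc (n);

  if (copy != NULL)
    memcpy (copy, utf8_name, n);
  return copy;
}

/* Hypothetical helper: stat() a file whose name is held in UTF-8.
 * Returns what stat() returns; errno is preserved across the free(). */
int
stat_utf8 (const char *utf8_name, struct stat *buf)
{
  char *locale_name;
  int err, saved_errno;

  assert (utf8_name != NULL && buf != NULL);

  locale_name = filename_from_utf8 (utf8_name);
  if (locale_name == NULL)
    {
      errno = ENOMEM;
      return -1;
    }

  err = stat (locale_name, buf);
  saved_errno = errno;   /* free() is allowed to change errno */
  free (locale_name);
  errno = saved_errno;

  return err;
}
```

A call site would then read "err = stat_utf8 (mfilename, &buf);", and
the question of which encoding the temporary is in stays inside the
helper.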

Come to think of it, maybe it would be a good idea to introduce an
opaque type to GLib 1.3, "GSystemCharsetString" or something, to be
used for all strings that are in the locale-dependent charset. This
type would actually be just gchar[], but the compiler wouldn't know
that, and thus it would be easier to keep track of which strings are
in which encoding. I'll have to think more about this.
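
A rough sketch of how such a distinct type could work in plain C
follows. This is not an existing GLib API: Utf8String, LocaleString
and the function names are made up for illustration, and the
conversion is again just a copy. The point is only that two
single-member structs are different types, so handing a UTF-8 string
to something that wants a locale-charset string fails to compile
instead of failing at run time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper types. Each holds an ordinary char pointer, but
 * the structs are distinct types as far as the compiler is concerned. */
typedef struct { char *bytes; } Utf8String;    /* text for GTK+, messages */
typedef struct { char *bytes; } LocaleString;  /* names for stat()/open() */

/* Duplicate a NUL-terminated string; NULL on allocation failure. */
char *
dup_bytes (const char *s)
{
  size_t n = strlen (s) + 1;
  char *copy = malloc (n);

  if (copy != NULL)
    memcpy (copy, s, n);
  return copy;
}

/* Placeholder conversions: plain copies here. A real version would do
 * the charset conversion at exactly these two points. */
LocaleString
locale_from_utf8 (Utf8String in)
{
  LocaleString out;

  assert (in.bytes != NULL);
  out.bytes = dup_bytes (in.bytes);
  return out;
}

Utf8String
utf8_from_locale (LocaleString in)
{
  Utf8String out;

  assert (in.bytes != NULL);
  out.bytes = dup_bytes (in.bytes);
  return out;
}

/* Only a LocaleString is accepted where a pathname is needed. */
const char *
locale_string_path (LocaleString s)
{
  return s.bytes;
}
```

With these definitions, locale_string_path (some_utf8_string) is
rejected by the compiler, whereas with bare gchar pointers the same
mistake would go unnoticed.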

--tml