Re: ISO-8859-1 to UTF-8 conversion

On Sun, 2002-10-20 at 04:25, Karsten Weiss wrote:
> 
> I would like to know how you are currently handling the
> conversion of your systems to UTF-8. Please share your
> experience!

I've only had a few filenames to convert myself.  It was fairly simple.
Havoc posted a script to do it.  Interesting notes on the subject
include:
https://listman.redhat.com/pipermail/psyche-list/2002-October/001965.html
https://listman.redhat.com/pipermail/psyche-list/2002-October/001027.html

> * Is there a program similar which can determine the
>   character set of a given text file? I know there is iconv
>   to convert character sets of text files. But I still
>   don't know a program which tells me if a given file is
>   encoded in ISO-8859-1, ISO-8859-15, a Windows code page,
>   etc.

Uh... not really.  That's what's wrong with locale-specific encodings:
nothing about the file itself really indicates what encoding was used.
The bytes stay the same; they are just interpreted differently depending
on the locale you're using when you view or edit the file.
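
One partial exception: UTF-8 has a strict byte structure, so you can at
least test whether a file is valid UTF-8, even though nothing will tell
ISO-8859-1 apart from ISO-8859-15 or a Windows code page.  A rough
sketch in Python (the function name and the three labels are my own
invention, not from any existing tool):

```python
def classify_bytes(data: bytes) -> str:
    """Rough guess at the encoding: 'ascii', 'utf-8', or 'unknown-8bit'."""
    try:
        data.decode("ascii")
        return "ascii"              # no bytes >= 0x80 at all
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"              # every multi-byte sequence is well formed
    except UnicodeDecodeError:
        # Some legacy 8-bit encoding; which one cannot be told from the bytes.
        return "unknown-8bit"
```

This is only a heuristic.  A short ISO-8859-1 file can happen to be
valid UTF-8, although that is unlikely for real text.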

> * Which program are you using to convert your ISO-8859-1
>   file systems (directory- and filenames - not the file
>   contents!) to UTF-8?

See the second message linked above.
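
For the general idea (this is NOT Havoc's script, just a minimal sketch
of my own, assuming every non-UTF-8 name really is ISO-8859-1), you can
walk the tree with byte-string paths and re-encode each name.  The
names to_utf8_name and convert_tree are made up for illustration:

```python
import os

def to_utf8_name(name: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode one file name as UTF-8; leave it alone if it already is."""
    try:
        name.decode("utf-8")
        return name                 # plain ASCII or already valid UTF-8
    except UnicodeDecodeError:
        return name.decode(source_encoding).encode("utf-8")

def convert_tree(root: bytes, dry_run: bool = True) -> list:
    """Rename entries below root (root itself is not renamed).

    Returns the list of (old_path, new_path) pairs that were, or in a
    dry run would be, renamed.
    """
    renames = []
    # Bottom-up, so a directory's contents are handled before the
    # directory itself gets a new name.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new = to_utf8_name(name)
            if new == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new)
            if os.path.lexists(new_path):
                continue            # never overwrite an existing entry
            renames.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return renames
```

Run it with dry_run=True first, e.g. convert_tree(b"/home/me"), and
look over the returned pairs before renaming anything for real.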

> * I'm not sure how I am supposed to handle all my text files.
>   All of them are using ISO-8859-1 right now. But now that
>   I'm using Red Hat 8 I'm never sure if the text editor
>   saves them in ISO-8859-1 or UTF-8.

All but the really basic editors should do the Right Thing and save
text as UTF-8.  If you want to check, use "od -c" and see what bytes are
used where you have non-ASCII characters in your files.  In ISO-8859-1
each such character is a single byte; in UTF-8 it shows up as two.
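
To illustrate what to look for, here is a small Python sketch (the
helper name is hypothetical) that lists the bytes above 0x7f in octal,
which is how "od -c" typically prints them:

```python
def non_ascii_bytes(data: bytes) -> list:
    """Return (offset, octal) pairs for every byte >= 0x80 in data."""
    return [(i, format(b, "03o")) for i, b in enumerate(data) if b >= 0x80]

# The same word saved in the two encodings:
print(non_ascii_bytes("für".encode("iso-8859-1")))  # [(1, '374')]
print(non_ascii_bytes("für".encode("utf-8")))       # [(1, '303'), (2, '274')]
```

A lone 374 where the u-umlaut should be means ISO-8859-1; the pair
303 274 means UTF-8.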

> * What about other non-UTF-8-aware machines accessing my
>   files? File "formats" without a text encoding tag are
>   becoming really problematic now, aren't they?

Anything worth using understands UTF-8, AFAIK.

> * How to convert ID3 tags?

Good question;  I've been working on the same thing.  I'm testing a
script that a friend of mine has been working on, but it's nothing I'd
really recommend at the moment.  Most MP3 software just happily writes
ISO-8859-1 characters.  Ogg Vorbis files don't seem to have the same
problem.  I think 'vorbiscomment' converts the encodings properly.
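
This is not the script I mentioned, but as a rough sketch of the first
half of the job: an ID3v1 tag is just the last 128 bytes of the file
("TAG", then fixed-width title, artist, album, year, comment and a
genre byte), so reading it as ISO-8859-1 is easy.  The function name is
made up, and the assumption that the tag really is ISO-8859-1 is
exactly that, an assumption:

```python
import struct

def read_id3v1(data: bytes, encoding: str = "iso-8859-1"):
    """Decode an ID3v1 tag from the end of an MP3 file's bytes, or None."""
    tag = data[-128:]
    if len(tag) != 128 or not tag.startswith(b"TAG"):
        return None                 # no ID3v1 tag present
    _, title, artist, album, year, comment, genre = struct.unpack(
        "<3s30s30s30s4s30sB", tag)

    def clean(field: bytes) -> str:
        # Fields are padded with NUL bytes or spaces.
        return field.split(b"\x00", 1)[0].decode(encoding).rstrip()

    return {
        "title": clean(title),
        "artist": clean(artist),
        "album": clean(album),
        "year": clean(year),
        "comment": clean(comment),
        "genre": genre,
    }
```

The strings that come back are Unicode, so .encode("utf-8") gives you
the UTF-8 bytes.  Where to write them (ID3v1 has no encoding field) is
the part that is still unsolved.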

> PS: The non-working umlauts in pine with a WONTFIX bug status
>   is a major problem for me.

Use a better mailer (doesn't mutt support UTF-8?) or request that the
pine maintainers fix it.  The pine license sorta sucks, and I think it
prevents Red Hat from fixing problems themselves.  The same sort of
thing is the reason that Red Hat doesn't ship qmail... :-/





-- 
Psyche-list mailing list
Psyche-list@redhat.com
https://listman.redhat.com/mailman/listinfo/psyche-list
