Re: [Bacula-users] Catastrophic changes to PostgreSQL 8.4

On 12/3/2009 3:33 AM, Craig Ringer wrote:
Kern Sibbald wrote:
Hello,

Thanks for all the answers; I am a bit overwhelmed by the number, so I am
going to try to answer everyone in one email.

The first thing to understand is that it is *impossible* to know what the
encoding is on the client machine (FD -- or File daemon).  On say a

Or, even worse, which encoding the user or application was thinking of when it wrote a particular file out. There's no guarantee that any two files on a system were intended to be looked at with the same encoding.

Unix/Linux system, the user could create filenames with non-UTF-8 then switch
to UTF-8, or restore files that were tarred on Windows or on Mac, or simply
copy a Mac directory.  Finally, using system calls to create a file, you can
put *any* character into a filename.

While true in theory, in practice it's pretty unusual to have filenames
encoded with an encoding other than the system LC_CTYPE on a modern
UNIX/Linux/BSD machine.

Unless, of course, you're at a good-sized school with lots of international students, and have fileservers holding filenames created on desktops running in Chinese, Turkish, Russian, and other locales.

In the end, a filename is (under Linux, at least) just a string of arbitrary bytes containing anything except / and NUL. If Bacula tries to get too clever and munges or misinterprets those byte strings, or, worse yet, if the database does it behind your back, then stuff _will_ end up breaking.
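
To make the "arbitrary bytes" point concrete, here is a minimal sketch in Python (my choice of language for illustration; it is not how Bacula or PostgreSQL handle this). The helper names is_valid_utf8, to_text_lossless and to_bytes_lossless are hypothetical. It shows the same visible name encoded two ways, that strict UTF-8 decoding rejects one of them, and that the raw bytes survive only if they are carried through untouched (here via Python's "surrogateescape" error handler).

```python
def is_valid_utf8(raw_name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_text_lossless(raw_name: bytes) -> str:
    """Decode as UTF-8, smuggling undecodable bytes through as lone surrogates."""
    return raw_name.decode("utf-8", errors="surrogateescape")


def to_bytes_lossless(text_name: str) -> bytes:
    """Inverse of to_text_lossless: recover the exact original bytes."""
    return text_name.encode("utf-8", errors="surrogateescape")


# The same visible name, written by two clients with different locales.
latin1_name = "café.txt".encode("latin-1")  # one byte for the e-acute
utf8_name = "café.txt".encode("utf-8")      # two bytes for the e-acute

for raw in (latin1_name, utf8_name):
    print(raw, "valid UTF-8:", is_valid_utf8(raw))

# Round trip keeps the bytes intact even when they are not valid UTF-8.
round_tripped = to_bytes_lossless(to_text_lossless(latin1_name))
print("round trip preserved bytes:", round_tripped == latin1_name)
```

Anything that insists on a single encoding at storage time would have to either reject the first name or silently rewrite it, which is the kind of breakage described above.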

(A few years back, someone heavily involved in Linux kernel filesystem work was talking about this exact issue, and made the remark that many people doing internationalization work secretly feel it would be easier to just teach everyone English. Impossible as this may be, I have since come to understand what they were talking about...)

--
Frank Sweetser fs at wpi.edu  |  For every problem, there is a solution that
WPI Senior Network Engineer   |  is simple, elegant, and wrong. - HL Mencken
     GPG fingerprint = 6174 1257 129E 0D21 D8D4  E8A3 8E39 29E3 E2E8 8CEC

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
