Trouble mounting FAT filesystem with Japanese files on it

I have a linux-2.4.18 embedded system that is configured to use UTF-8
as its default codepage.  It also has CP932 (SJIS) compiled in.
I am able to mount SMB shares that contain Japanese files, and I can
"ls" directories with those files in them without any problems (the
Japanese filenames are printed correctly).  So I know that the kernel
and the terminal are able to pass UTF-8 Japanese characters.

The problem is that when I mount a FAT filesystem that includes files
with Japanese filenames, "ls" does not print the UTF-8 translation of
those filenames.  I expect to see this:

  # ls /tmp/cf
  カタカナ  ascii            atend日本語     日本語
  ひらがな  ascii-long-name  in日本語middle  日本語atstart
  日本語とても長いファイル名
  # 

(The above was captured on a laptop running Fedora Core 2
(linux-2.6.5), with the FAT filesystem on a CompactFlash card mounted
without any options.)

...but on linux-2.4.18 what I actually see is this:

  # ls /tmp/cf
  ______~1  ____~2    ___~1     ascii-~1  in___m~1
  ____~1    ___ats~1  ascii     atend_~1
  #

This is the same regardless of what codepage or iocharset I specify:

  # mount linux-cf-test.img /tmp/cf -t msdos -o loop,codepage=437
  # ls /tmp/cf
  ______~1  ____~2    ___~1     ascii-~1  in___m~1
  ____~1    ___ats~1  ascii     atend_~1
  # umount /tmp/cf
  # mount linux-cf-test.img /tmp/cf -t msdos -o loop,codepage=932
  # ls /tmp/cf
  ______~1  ____~2    ___~1     ascii-~1  in___m~1
  ____~1    ___ats~1  ascii     atend_~1
  # umount /tmp/cf
  # mount linux-cf-test.img /tmp/cf -t msdos -o loop,codepage=932,iocharset=utf8
  # ls /tmp/cf
  ______~1  ____~2    ___~1     ascii-~1  in___m~1
  ____~1    ___ats~1  ascii     atend_~1
  #
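
One thing the mounts above do not vary is the filesystem type: all of
them use -t msdos.  The msdos driver handles only 8.3 short names; long
filenames are read by the separate vfat driver.  A minimal sketch for
checking whether vfat is registered with the running kernel follows.
has_fs is a hypothetical helper name, and a driver built as a module
that is not yet loaded will not appear in /proc/filesystems:

```shell
#!/bin/sh
# has_fs TYPE [LIST]: hypothetical helper, succeeds if TYPE is listed in LIST.
# LIST defaults to /proc/filesystems, where each line is "[nodev]<TAB>type".
has_fs() {
    awk -v t="$1" '$NF == t { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}

if has_fs vfat; then
    echo "vfat is registered"
else
    echo "vfat is not registered (it may be an unloaded module, or not built)"
fi
```

If vfat is available, the same test could be repeated with that type in
place of msdos (not tried on this system):

  # mount linux-cf-test.img /tmp/cf -t vfat -o loop,codepage=932,iocharset=utf8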

I have tried mounting a CompactFlash card (on the IDE bus), as well as
mounting an image of that CompactFlash card over NFS and SMBFS, and
all behave exactly the same way.

It appears that Linux doesn't even attempt to read the long version of
the filenames on the card, because when I "strace ls", I see that the
file names returned from getdents64() are the abbreviated versions:

  open("/tmp/cf", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY)           = 3
  fstat64(3, {st_mode=S_IFDIR|0755, st_size=16384, ...})                 = 0
  fcntl64(3, F_SETFD, FD_CLOEXEC)                                        = 0
  getdents64(3, /* 11 entries */, 2048)                                  = 336
  lstat64("/tmp/cf/ascii", {st_mode=S_IFREG|0755, st_size=20, ...})      = 0
  lstat64("/tmp/cf/ascii-~1", {st_mode=S_IFREG|0755, st_size=24, ...})   = 0
  lstat64("/tmp/cf/___~1", {st_mode=S_IFREG|0755, st_size=24, ...})      = 0
  lstat64("/tmp/cf/______~1", {st_mode=S_IFREG|0755, st_size=42, ...})   = 0
  lstat64("/tmp/cf/___ats~1", {st_mode=S_IFREG|0755, st_size=33, ...})   = 0
  lstat64("/tmp/cf/in___m~1", {st_mode=S_IFREG|0755, st_size=34, ...})   = 0
  lstat64("/tmp/cf/atend_~1", {st_mode=S_IFREG|0755, st_size=31, ...})   = 0
  lstat64("/tmp/cf/____~1", {st_mode=S_IFREG|0755, st_size=27, ...})     = 0
  lstat64("/tmp/cf/____~2", {st_mode=S_IFREG|0755, st_size=27, ...})     = 0
  getdents64(3, /* 0 entries */, 2048)                                   = 0
  close(3)                                                               = 0

If getdents64() had returned the long version of those filenames,
calls to lstat64() would not have used the short version, right?
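
To rule out the terminal or "ls" itself altering the names, the raw
bytes of each directory entry can be dumped directly.  This is only a
sketch; name_hex and dump_names are hypothetical helper names, and the
directory argument is whatever mount point is being examined:

```shell
#!/bin/sh
# name_hex NAME: print the bytes of NAME as lowercase hex, no separators.
name_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# dump_names DIR: print each entry of DIR followed by its raw bytes.
dump_names() {
    for path in "$1"/*; do
        [ -e "$path" ] || continue   # skip the unexpanded pattern if DIR is empty
        name=${path##*/}             # strip the directory part
        printf '%s  %s\n' "$name" "$(name_hex "$name")"
    done
}
```

Running "dump_names /tmp/cf" would show a literal underscore as 5f,
whereas a UTF-8 encoded katakana character such as カ would appear as
the three bytes e3 82 ab.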

Any suggestions on how to get linux-2.4.18 to list FAT filesystems
with Japanese filenames on them correctly?

Thanks,
Dave


--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/

