Re: Hebrew Directory Names

On 5/28/09 10:15 AM, "Nitsan Bin-Nun" <nitsan@xxxxxxxxxxxx> wrote:

> I have written a file-based PHP system that does not require any kind of
> database; it is based entirely on files and directories.
> 
> I'm using scandir() to fetch the file names in a directory. When the files
> and directories have English names everything works like a charm; the
> problems start when the file or directory names are in Hebrew.
> 
> I tried several encodings for the directory names (including UTF-8) but no
> luck so far. I always get is_file() === FALSE when the directory name is in
> Hebrew and the filename is in English.
> 
> Any ideas??
> 
> The path which I'm checking with is_file() is:
> 
> string(63) "/home/nitsanbn/public_html/iphoneia/walls/מחשבים/1346.jpg"
> 
> 
> (Due to the LTR encoding of this message the filename looks like it's
> in Hebrew; the directory is מחשבים and the filename is 1346.jpg.)

interesting problem. i suspect the behaviour depends heavily on the os and
file system.

on os x 10.5 and a Mac OS Extended volume i created a directory with 5 files
in it as follows:

$ ls במהירות
את        עסקת        שליט        רוצים        להשלים

(i've no idea what these words mean. i copied them from a hebrew newspaper
web site.)

i ran the following script with cwd set to the directory in php 5.2.8 cli:

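<?php
// use a utf-8 locale and charset so multibyte names are handled consistently
$default_locale = setlocale(LC_ALL, 'en_US.UTF-8');
ini_set('default_charset', 'UTF-8');
// collect every entry in the current working directory
$files = array();
$dh = opendir('.');
while (false !== ($filename = readdir($dh))) {
    $files[] = $filename;
}
closedir($dh);
sort($files);
print_r($files);
// report whether each entry is a regular file, along with its resolved path
foreach ($files as $f) {
    print( (is_file($f) ? "file:\t\t" : "not file:\t").realpath($f)."\n" );
}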


Array
(
    [0] => .
    [1] => ..
    [2] => את
    [3] => להשלים
    [4] => עסקת
    [5] => רוצים
    [6] => שליט
)
not file:    /Users/fsb/במהירות
not file:    /Users/fsb
file:        /Users/fsb/במהירות/את
file:        /Users/fsb/במהירות/להשלים
file:        /Users/fsb/במהירות/עסקת
file:        /Users/fsb/במהירות/רוצים
file:        /Users/fsb/במהירות/שליט

ok. but as you noted, the ltr reversal switches path order as well as
character order, which is a bit confusing. (btw: the square brackets above
got reversed in the cut and paste from terminal window to entourage.)
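
since the display reordering makes it hard to trust what is on screen, one way to take the screen out of the picture is to dump the raw bytes of the path. a minimal sketch, assuming the path reaches php as utf-8. describe_path() is a made-up helper name, and the path is the one from the original message:

```php
<?php
// hypothetical helper: byte length, utf-8 validity and a hex dump per path segment
function describe_path($path)
{
    $segments = array();
    foreach (explode('/', $path) as $segment) {
        if ($segment === '') {
            continue; // skip the empty piece before the leading slash
        }
        // hex shows the bytes in logical order, with no bidi reordering
        $segments[$segment] = bin2hex($segment);
    }
    return array(
        // strlen() counts bytes, unless mbstring.func_overload replaces it
        'bytes'      => strlen($path),
        // the /u modifier makes preg_match() fail on invalid utf-8
        'valid_utf8' => preg_match('//u', $path) === 1,
        'segments'   => $segments,
    );
}

$info = describe_path('/home/nitsanbn/public_html/iphoneia/walls/מחשבים/1346.jpg');
print_r($info);
```

the string(63) in the original message is at least consistent with utf-8: 42 ascii bytes for the leading directories, 12 bytes for the six hebrew letters and 9 for /1346.jpg. comparing the hex of a name returned by scandir() with the hex of the name in the path being tested would show whether the two really are the same bytes.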

then i renamed the files thus:

$ ls 
את.jpg            שליט.jpg        להשלים.jpg
עסקת.jpg        רוצים.jpg

$ ls -l
total 0
-rw-r--r--  1 fsb  fsb  0 May 28 12:16 את.jpg
-rw-r--r--  1 fsb  fsb  0 May 28 12:16 עסקת.jpg
-rw-r--r--  1 fsb  fsb  0 May 28 12:59 שליט.jpg
-rw-r--r--  1 fsb  fsb  0 May 28 12:16 רוצים.jpg
-rw-r--r--  1 fsb  fsb  0 May 28 12:16 להשלים.jpg

which is also a tad confusing, ls using two different conventions. but the
script still seems to work.

Array
(
    [0] => .
    [1] => ..
    [2] => את.jpg
    [3] => להשלים.jpg
    [4] => עסקת.jpg
    [5] => רוצים.jpg
    [6] => שליט.jpg
)
not file:    /Users/fsb/במהירות
not file:    /Users/fsb
file:        /Users/fsb/במהירות/את.jpg
file:        /Users/fsb/במהירות/להשלים.jpg
file:        /Users/fsb/במהירות/עסקת.jpg
file:        /Users/fsb/במהירות/רוצים.jpg
file:        /Users/fsb/במהירות/שליט.jpg

now with a couple of numeric file names in the same directory:

$ ls
1234.jpg        עסקת.jpg        להשלים.jpg
2345.jpg        שליט.jpg
את.jpg            רוצים.jpg

Array
(
    [0] => .
    [1] => ..
    [2] => 1234.jpg
    [3] => 2345.jpg
    [4] => את.jpg
    [5] => להשלים.jpg
    [6] => עסקת.jpg
    [7] => רוצים.jpg
    [8] => שליט.jpg
)
not file:    /Users/fsb/במהירות
not file:    /Users/fsb
file:        /Users/fsb/במהירות/1234.jpg
file:        /Users/fsb/במהירות/2345.jpg
file:        /Users/fsb/במהירות/את.jpg
file:        /Users/fsb/במהירות/להשלים.jpg
file:        /Users/fsb/במהירות/עסקת.jpg
file:        /Users/fsb/במהירות/רוצים.jpg
file:        /Users/fsb/במהירות/שליט.jpg

so it looks like things are working here.
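
the script above runs with cwd inside the directory, whereas the original question uses scandir() and a full path. a rough sketch of that layout (hebrew directory name, english file name), assuming a file system that stores names as utf-8 bytes. list_files() and the temp directory are made up for the test, and i can't tell whether this is what is going wrong on your server:

```php
<?php
// hypothetical helper: list regular files in $dir, testing each entry by its full path
function list_files($dir)
{
    $found = array();
    foreach (scandir($dir) as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        // scandir() returns bare names, so prefix the directory before is_file()
        $path = $dir . '/' . $entry;
        if (is_file($path)) {
            $found[] = $entry;
        }
    }
    sort($found);
    return $found;
}

// throwaway layout under the system temp directory (an assumption, not the original paths)
$base = sys_get_temp_dir() . '/' . uniqid('hebrew_test_');
$dir  = $base . '/מחשבים';
mkdir($dir, 0777, true);
touch($dir . '/1346.jpg');

$result = list_files($dir);
print_r($result);

// remove the test files again
unlink($dir . '/1346.jpg');
rmdir($dir);
rmdir($base);
```

if this lists 1346.jpg on your machine but the real directory does not, the difference is more likely in how the real directory name was created or encoded than in scandir() or is_file() themselves.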

i was unable to do anything hebrew at all on

in your position i would check the default charset and locale settings. also
perhaps try a different php version. i expect you have mbstring installed. i
have 

mbstring.func_overload = 7
mbstring.internal_encoding = UTF-8

but i doubt that makes the difference.
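
to compare settings between two machines, something like the following collects them in one place. it is only a sketch: encoding_report() is a made-up name, and the mbstring settings may not exist in every php version, in which case ini_get() returns false:

```php
<?php
// hypothetical helper: gather the settings mentioned above into one array
function encoding_report()
{
    return array(
        'php_version'      => PHP_VERSION,
        'os'               => PHP_OS,
        // passing '0' queries the current locale without changing it
        'locale_ctype'     => setlocale(LC_CTYPE, '0'),
        'default_charset'  => ini_get('default_charset'),
        'mbstring_loaded'  => extension_loaded('mbstring'),
        // false here means the setting is not known to the running php
        'mb_func_overload' => ini_get('mbstring.func_overload'),
        'mb_internal_enc'  => ini_get('mbstring.internal_encoding'),
    );
}

$report = encoding_report();
print_r($report);
```

running it both from the cli and through the web server is worthwhile, since the two can pick up different php.ini files and different locale environments.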





