Detecting The Encoding Of A Text File

Hi,

I have been trying for the last couple of hours to determine the
encoding of a text file (a .txt file created on Windows).

I have this code:

        $contents = file_get_contents($config['txt_dir'] . $file);
        $encoding = mb_detect_encoding($contents,
            "UTF-8,ISO-8859-1,WINDOWS-1252"); //,Windows-1255

        echo "||encoding:".$encoding."||";

        if ($encoding == 'UTF-8')
        {
            $utfcontents = $contents;
        }
        else if ($encoding == 'ISO-8859-1')
        {
            $utfcontents = utf8_encode($contents);
        }

        var_dump($utfcontents);

The detected $encoding is ISO-8859-1, even though the text file contains
Hebrew characters, so the code converts the contents to UTF-8 with
utf8_encode().

The above code outputs gibberish. It seems to have converted the text in
some way, but not in the way it should have.
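
A possible explanation, offered as a sketch rather than a confirmed diagnosis: a single-byte encoding such as ISO-8859-1 accepts practically any byte sequence, so mb_detect_encoding() has no way to tell Hebrew text from Latin text, and utf8_encode() then treats every byte as a Latin-1 character. Assuming the file was saved in Windows-1255 (a guess, nothing above confirms it), a conversion could look like the following. The helper name to_utf8() and its $fallback parameter are hypothetical, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: return $contents as UTF-8.
// ASSUMPTION: anything that is not valid UTF-8 is Windows-1255 (Hebrew).
function to_utf8(string $contents, string $fallback = 'WINDOWS-1255'): string
{
    // Strict validity check: valid UTF-8 is returned unchanged.
    if (mb_check_encoding($contents, 'UTF-8')) {
        return $contents;
    }

    // iconv() maps each legacy byte to its Unicode code point.
    $converted = iconv($fallback, 'UTF-8', $contents);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $fallback");
    }

    return $converted;
}
```

With this in place, the if/else chain above would reduce to $utfcontents = to_utf8($contents);. The fallback encoding still has to be known in advance, because it cannot be reliably detected from the bytes alone.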

My page is UTF-8 encoded, without a BOM. I send UTF-8 headers to the
browser and include an HTML meta tag declaring the content encoding as well.

I have no idea what I am doing wrong.

I would highly appreciate it if someone could point me in the right
direction.

Thanks in advance,

Nitsan
