Re: Check the byte sequence of a file to tell if it is UTF-8 without the BOM using PHP ?

Dear Peter,

But my point was different.

If a file does NOT have a BOM, can mb_detect_encoding
detect the file's encoding by scanning the whole file?

Thanks

Eli

On 22/05/2011 09:53, Peter Lind wrote:
On 22 May 2011 08:17, Eli Orr (Office)<eli.orr@xxxxxxxxxxxx>  wrote:
Hi Adam,

I have proof that the XML advice does not work in the real cases I have had.
We use XML files in our system, but when you edit the XML with a text
editor and add the XML declaration for UTF-8
<?xml version="1.0" encoding="UTF-8"?>

it DOES NOT ensure that the text inside is encoded in UTF-8; in many
cases it is in some other ISO-xxx encoding.
The point of the header is telling readers what encoding is used. Of
course that means errors are possible - setting the header is not
magic, it doesn't change the rest of the file. You need to make sure
the contents of the file match the encoding from the header when you
make XML documents.

Anyway, from your perspective, the header is an indication but not a
foolproof way of figuring encoding out.
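A minimal sketch of the mismatch described above, assuming the mbstring extension is loaded. The function name declares_utf8_but_is_not() is hypothetical and not from this thread; it only covers the case where the declaration says UTF-8.

```php
<?php
// Hypothetical helper: true when the XML declaration claims UTF-8
// but the bytes of the document are not valid UTF-8.
function declares_utf8_but_is_not(string $xml): bool
{
    // Look for encoding="UTF-8" (any letter case) inside the leading <?xml ...> declaration.
    $declaresUtf8 = preg_match('/^<\?xml[^>]*encoding=["\']utf-8["\']/i', $xml) === 1;

    // mb_check_encoding() validates the whole byte string against UTF-8.
    return $declaresUtf8 && !mb_check_encoding($xml, 'UTF-8');
}

$header = '<?xml version="1.0" encoding="UTF-8"?>';
var_dump(declares_utf8_but_is_not($header . "<a>\xC3\xA1</a>")); // content really is UTF-8
var_dump(declares_utf8_but_is_not($header . "<a>\xE1</a>"));     // content is ISO-8859-1
```

The second call is the situation from the text editor example: the header was typed in, but the editor saved the body in another encoding.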

My question was about a function that scans the bytes of the file and decides
WITHOUT relying on a BOM.
I mean by checking the byte sequences in the file.

I claim that WITHOUT a BOM it might be impossible to be sure a file is UTF-8
encoded, since UTF-8 is a whole escape-sequence logic
that may convert one character into one, two, three or four bytes.
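A byte-sequence check of the kind asked about here can be sketched without any BOM, because well-formed UTF-8 follows fixed lead-byte and continuation-byte ranges. The function name is_well_formed_utf8() is hypothetical. Passing this check shows the bytes are valid UTF-8, not that the author intended UTF-8, and very large inputs may hit PCRE limits, in which case this sketch reports false.

```php
<?php
// Hypothetical helper: scan every byte and accept only well-formed UTF-8 sequences.
function is_well_formed_utf8(string $bytes): bool
{
    $pattern = '/\A(?:
          [\x00-\x7F]                        # 1 byte: ASCII
        | [\xC2-\xDF][\x80-\xBF]             # 2 bytes
        | \xE0[\xA0-\xBF][\x80-\xBF]         # 3 bytes, no overlong forms
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # 3 bytes
        | \xED[\x80-\x9F][\x80-\xBF]         # 3 bytes, no surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # 4 bytes, no overlong forms
        | [\xF1-\xF3][\x80-\xBF]{3}          # 4 bytes
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # 4 bytes, up to U+10FFFF
    )*+\z/x';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $bytes) === 1;
}

var_dump(is_well_formed_utf8("\xC3\xA1"));         // two-byte sequence
var_dump(is_well_formed_utf8("\xE1\xE9\xF3\xFA")); // ISO-8859-1 bytes
```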
http://se.php.net/manual/en/function.mb-detect-encoding.php - the
first comment should be interesting to you.

*****
If you try to use mb_detect_encoding to detect whether a string is
valid UTF-8, use the strict mode, it is pretty worthless otherwise.

<?php
     $str = "\xE1\xE9\xF3\xFA"; // 'áéóú' as ISO-8859-1 bytes
     mb_detect_encoding($str, 'UTF-8'); // 'UTF-8'
     mb_detect_encoding($str, 'UTF-8', true); // false
?>
****
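Applied to the original question, a rough sketch for classifying a file's bytes with and without a BOM might look like this. It assumes the mbstring extension is loaded; the function name classify_utf8() and its return labels are made up for illustration.

```php
<?php
// Hypothetical helper: classify a byte string that was read from a file.
function classify_utf8(string $bytes): string
{
    // A UTF-8 BOM is the three bytes EF BB BF at the very start.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'utf8-with-bom';
    }
    // mb_check_encoding() validates the whole string, like strict detection.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        // Pure ASCII is also valid UTF-8, so report it separately.
        return preg_match('/[\x80-\xFF]/', $bytes) === 1 ? 'utf8-no-bom' : 'ascii';
    }
    return 'not-utf8';
}

// Usage with a file (the path is a placeholder):
// echo classify_utf8(file_get_contents('/path/to/file.xml'));
```

A result of 'utf8-no-bom' means every byte sequence in the file is valid UTF-8. For text with several non-ASCII characters that is strong evidence, since text in ISO-8859-x encodings rarely forms valid UTF-8 sequences by accident, but it is still not a proof of what the author intended.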

Regards
Peter



--
Best Regards,

*Eli Orr*
CTO & Founder
*LogoDial Ltd.*
M:+972-54-7379604
O:+972-74-703-2034
F: +972-77-3379604

Plaut 10, Rehovot, Israel
Email: _Eli.Orr@xxxxxxxxxxxxx
Skype: _eliorr.com_
