Re: Check the byte sequence of a file to tell if it is UTF-8 without the BOM using PHP ?

On 22 May 2011 08:17, Eli Orr (Office) <eli.orr@xxxxxxxxxxxx> wrote:
> Hi Adam,
>
> I have proof that the XML advice does not work in real cases I have had.
> We are using XMLs in our system, but when you edit the XML with a text
> editor and put in the XML heading for UTF-8
> <?xml version="1.0" encoding="UTF-8"?>
>
> it DOES NOT ensure the text inside is encoded in UTF-8; in many cases
> it is in some other iso-xxx encoding.

The point of the header is telling readers what encoding is used. Of
course that means errors are possible - setting the header is not
magic, it doesn't change the rest of the file. You need to make sure
the contents of the file match the encoding from the header when you
make XML documents.

Anyway, from your perspective, the header is an indication of the
encoding, but not a foolproof way of figuring it out.
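
A minimal sketch of such a consistency check, assuming the mbstring
extension is loaded. The function names are made up for illustration. It
only covers the UTF-8 case, because single-byte encodings such as
ISO-8859-1 accept almost any byte sequence, so validating against them
proves very little:

```php
<?php
// Hypothetical helper: returns the encoding named in the XML declaration,
// or null if there is no declaration or no encoding attribute.
function declaredXmlEncoding(string $xml): ?string
{
    // Allow an optional UTF-8 BOM in front of the declaration.
    $pattern = '/^(?:\xEF\xBB\xBF)?<\?xml\s[^>]*\bencoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\1/';
    return preg_match($pattern, $xml, $m) === 1 ? $m[2] : null;
}

// Hypothetical helper: true only when the document declares UTF-8
// AND its bytes really are valid UTF-8.
function declaresUtf8AndIsValid(string $xml): bool
{
    $declared = declaredXmlEncoding($xml);
    return $declared !== null
        && strcasecmp($declared, 'UTF-8') === 0
        && mb_check_encoding($xml, 'UTF-8');
}
```

A document that declares UTF-8 but was saved as ISO-8859-1 with accented
characters would make declaresUtf8AndIsValid() return false.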

> My question was about a function that scans the bytes of the file and decides
> WITHOUT the BOM heading.
> I mean by checking the byte sequences in the file.
>
> I claim that WITHOUT a BOM it might be impossible to be sure it is UTF-8
> encoding, which is a whole escape sequence logic
> that may convert one character into one, two or three characters.
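
That structure is exactly what makes a check without a BOM feasible. In
UTF-8 a character takes one to four bytes, the lead byte announces how
many continuation bytes follow, and every continuation byte has the form
10xxxxxx. Text in an iso-xxx encoding that contains non-ASCII bytes
rarely satisfies those rules by accident. A rough sketch of such a byte
scan (the function name is invented for illustration; it is not a
built-in):

```php
<?php
// Illustrative helper (hypothetical name): returns true only if $bytes is
// well-formed UTF-8, judged purely from the byte structure.
function isWellFormedUtf8(string $bytes): bool
{
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($bytes[$i]);
        if ($b < 0x80) {
            continue;                                  // 0xxxxxxx: ASCII
        } elseif (($b & 0xE0) === 0xC0) {              // 110xxxxx
            $need = 1; $cp = $b & 0x1F; $min = 0x80;
        } elseif (($b & 0xF0) === 0xE0) {              // 1110xxxx
            $need = 2; $cp = $b & 0x0F; $min = 0x800;
        } elseif (($b & 0xF8) === 0xF0) {              // 11110xxx
            $need = 3; $cp = $b & 0x07; $min = 0x10000;
        } else {
            return false;          // stray continuation or invalid lead byte
        }
        if ($i + $need >= $len) {
            return false;          // sequence is cut off at end of input
        }
        for ($k = 0; $k < $need; $k++) {
            $c = ord($bytes[++$i]);
            if (($c & 0xC0) !== 0x80) {
                return false;      // continuation bytes must be 10xxxxxx
            }
            $cp = ($cp << 6) | ($c & 0x3F);
        }
        // Reject overlong forms, surrogates, and values above U+10FFFF.
        if ($cp < $min || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
            return false;
        }
    }
    return true;
}
```

Note that a pure ASCII file passes this check too, since ASCII is a
subset of UTF-8. In practice mb_check_encoding($bytes, 'UTF-8') does the
same job; the loop is only meant to show what is being checked.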

http://se.php.net/manual/en/function.mb-detect-encoding.php - the
first comment should be interesting to you.

*****
If you try to use mb_detect_encoding to detect whether a string is
valid UTF-8, use the strict mode, it is pretty worthless otherwise.

<?php
    $str = "\xE1\xE9\xF3\xFA"; // accented letters as raw ISO-8859-1 bytes
    mb_detect_encoding($str, 'UTF-8'); // 'UTF-8'
    mb_detect_encoding($str, 'UTF-8', true); // false
?>
****
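
Applied to a whole file, the same idea might look like the sketch below.
The function name and the returned labels are invented for illustration,
and it uses mb_check_encoding() (mbstring) rather than
mb_detect_encoding(), since the question is only "is this valid UTF-8 or
not":

```php
<?php
// Hypothetical helper: classify a file by its bytes alone.
function describeFileEncoding(string $path): string
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // Validate the byte sequence first; a BOM alone proves nothing.
    if (!mb_check_encoding($bytes, 'UTF-8')) {
        return 'not UTF-8';
    }
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 with BOM';
    }
    // No byte above 0x7F means plain ASCII, which is also valid UTF-8.
    if (preg_match('/[\x80-\xFF]/', $bytes) !== 1) {
        return 'ASCII';
    }
    return 'UTF-8 without BOM';
}
```

A result of 'not UTF-8' is reliable. A result of 'UTF-8 without BOM' is
a very strong hint rather than a proof, because a file in another
encoding could in principle happen to contain only valid UTF-8
sequences.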

Regards
Peter

-- 
<hype>
WWW: plphp.dk / plind.dk
LinkedIn: plind
BeWelcome/Couchsurfing: Fake51
Twitter: kafe15
</hype>

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



