Re: Eliminating PHP UTF-8 BOM in a returned stream to a Mobile App

On 12 April 2011 13:29, Eli Orr <eli.orr@xxxxxxxxxxxx> wrote:
> Hi Richard,
>
> Thanks.
> I've already got a solution: simply use Notes++ and save the PHP script
> with the Save As encoding set to ANSI (it was indeed UTF-8 that created the BOM...).
>
> Thanks again
>
> Eli
>
> -----Original Message-----
> From: Richard Quadling [mailto:rquadling@xxxxxxxxx]
> Sent: Tuesday, April 12, 2011 2:59 PM
> To: Eli Orr
> Cc: php-general@xxxxxxxxxxxxx
> Subject: Re: Eliminating PHP UTF-8 BOM in a returned stream to a Mobile App
>
> On 12 April 2011 12:50, Richard Quadling <rquadling@xxxxxxxxx> wrote:
>> On 12 April 2011 11:59, Eli Orr <eli.orr@xxxxxxxxxxxx> wrote:
>>>
>>> Hi Richard,
>>>
>>> Thanks.
>>> Indeed, that is the case - I've included code that has UTF-8 string
>>> constants, so I guess the PHP parser set the UTF-8 mode to ON so that the string returned to the client has the UTF-8 BOM.
>>>
>>> It is not a big issue, as the mobile app guys are aware of this and apply the proper 3-byte offset.
>>> Anyhow, I was looking for a way to control that behaviour.
>>>
>>> Thanks
>>>
>>> Eli
>>>
>>> -----Original Message-----
>>> From: Richard Quadling [mailto:rquadling@xxxxxxxxx]
>>> Sent: Tuesday, April 12, 2011 12:45 PM
>>> To: Eli Orr
>>> Cc: php-general@xxxxxxxxxxxxx
>>> Subject: Re: Eliminating PHP UTF-8 BOM in a returned stream to
>>> a Mobile App
>>>
>>> 2011/4/12 Eli Orr <eli.orr@xxxxxxxxxxxx>:
>>>> Dear PHP Gurus,
>>>>
>>>> I would like to eliminate the 3-byte UTF-8 BOM enforced on my returned BLOB:
>>>>
>>>> The PHP server adds a UTF-8 BOM (UTF-8 Byte Order Mark - at the
>>>> beginning of UTF-8 files), which
>>>> consists of three bytes: EF BB BF.
>>>>
>>>> The mobile app served by the server does not need that. How can I
>>>> eliminate it?
>>>>
>>>> Thanks.
>>>>
>>>> UTF-8 Byte Order Mark - BOM:
>>>> http://unicode.org/faq/utf_bom.html#BOM
>>>>
>>>> Best Regards,
>>>>
>>>> Eli Orr
>>>> CTO & Founder
>>>> Mimmage.com
>>>> My virtual vCard
>>>> LogoDial Ltd.
>>>> M:+972-54-7379604
>>>> O:+972-74-703-2034
>>>> F: +972-77-3379604
>>>>
>>>> Plaut 10, Rehovot, Israel
>>>> Email: Eli.Orr@xxxxxxxxxxxx
>>>> Skype: eliorr.com
>>>>
>>>>
>>>> -----
>>>>
>>>>
>>>>
>>>> --
>>>> PHP General Mailing List (http://www.php.net/) To unsubscribe, visit:
>>>> http://www.php.net/unsub.php
>>>>
>>>>
>>>
>>> Can you show us the PHP script that DOES output the BOM?
>>>
>>> Normally, PHP doesn't do this automatically (AFAIK). The main reason is that the BOM often appears in the source code file before the <?php opening tag, where it would block headers (a session cookie, for example).
>>>
>>> If a BOM is being issued by PHP, it is either being done programmatically or going unnoticed because the initial source code file starts with a BOM.
>>>
>>> See http://docs.php.net/manual/en/function.session-start.php#102431,
>>> http://docs.php.net/manual/en/function.header.php#95864, etc.
>>>
>>> Now. Having said all of that, you may find you are using some sort of output buffering and that is setting the BOM after the headers are sent.
>>>
>>> But, as it stands, PHP will not be generating the BOM for you.
>>>
>>> --
>>> Richard Quadling
>>> Twitter : EE : Zend
>>> @RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY
>>>
>>>
>>>
>>>
>>
>> No. The parser does not _ADD_ the BOM.
>>
>> The BOM already exists in your source code. It has nothing to do with PHP.
>>
>> The file you included that has the UTF-8 constants has the BOM.
>>
>> You need to edit that file and remove the BOM. The actions you need to
>> take will depend upon your editor.
>>
>> --
>> Richard Quadling
>> Twitter : EE : Zend
>> @RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY
>>
>
> To be a bit more specific...
>
> The BOM is the 3 bytes you correctly identified earlier.
>
> Most editors won't show these when you edit the files.
>
> But, for the sake of argument, let's just pretend they are visible and look like ...
>
> #@&
>
> In the php script that contains some UTF-8 constants, the file would look like ...
>
> #@&<?php
> echo '￦'; // The Fullwidth Won sign.
> ?>
>
> As PHP will only actually parse the content between <?php and ?>, the #@& string (the BOM) is simply sent straight through to the web server and on to the client with no interruption.
>
> Now, if your code was ...
>
> #@&<?php
> session_start();
> ?>
>
> you would see the "headers already sent" error message, as the BOM tells the web server that data is now being received and to send any headers it already has.
>
> So when session_start() wants to send the session cookie (which is done as an HTTP header), PHP already knows some content has gone (the BOM) and reports the error.
>
> To reiterate, PHP is NOT generating the BOM. You already did that in your code. Well, the editor did it for you.
>
> Ideally, you want to turn off the BOM in your editor.
>
> --
> Richard Quadling
> Twitter : EE : Zend
> @RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY
>
>
>
>

Hmmm. You really want to save the code as UTF-8 without BOM.

What is the editor? The only references to Notes++ I can find are for a
post-it notes app.
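
If changing the editor setting is awkward, the BOM can also be removed from the
file itself with a few lines of PHP. This is only a rough sketch: the helper
names (has_utf8_bom, strip_utf8_bom, strip_bom_from_file) are made up for
illustration and are not part of PHP, and it assumes the file is small enough
to read into memory in one go.

```php
<?php
// The three bytes EF BB BF that make up the UTF-8 BOM.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if the byte string starts with the UTF-8 BOM.
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: return the string without a leading BOM, if any.
function strip_utf8_bom(string $bytes): string
{
    // The cast covers older PHP versions, where substr() can return false.
    return has_utf8_bom($bytes) ? (string) substr($bytes, 3) : $bytes;
}

// Hypothetical helper: rewrite a file in place without its BOM.
// Returns true only if a BOM was found and the file was rewritten.
function strip_bom_from_file(string $path): bool
{
    $bytes = file_get_contents($path);
    if ($bytes === false || !has_utf8_bom($bytes)) {
        return false;
    }
    return file_put_contents($path, strip_utf8_bom($bytes)) !== false;
}
```

Running strip_bom_from_file() once over the included file that holds the UTF-8
constants should have the same effect as re-saving it as "UTF-8 without BOM",
and the file stays UTF-8 rather than being converted to ANSI.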






-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY
