Re: Help needed with mb_convert_encoding()


 



On Wed, May 28, 2014 at 06:56:48AM -0300, Flavio Kenji Yanai wrote:
> I haven't tested it ...
> 
> $utf8_str = utf8_decode($original_str);
> 
> if (!strncmp($utf8_str, $original_str, strlen($original_str))) {
>     echo "equal, valid utf8";
> }
> else {
>     echo "not equal, invalid utf8 input";
> }

Sorry, maybe I did not explain myself well enough. I want to be able to provide
feedback to the user to say where something is wrong, so it would be nice to
say, with the example that I gave, something like:

    Bad input detected, invalid character(s) replaced by '?':

    a bad angle bracket ? here

I suppose that you could take the point of view that bad character encoding is
the result of someone trying to break the PHP script, and so you do not need to
be nice. But maybe it is the result of an innocent error somewhere.

With a bit of work I can find the first difference and replace it with '?', but
as far as I can see mb_convert_encoding() should make this easy.

> 2014-05-28 6:03 GMT-03:00 Alain Williams <addw@xxxxxxxxxxxx>:
> 
> > I am trying to use this to validate input that is supposed to be UTF-8 and to
> > replace any bad characters with something - '?' would do.
> >
> > I have the test program below. No matter what I try to give as an argument to
> > mb_substitute_character() it always removes the bad input sequence; I would
> > like to replace it.
> >
> > Thanks in advance
> >
> >     <?php
> >     mb_internal_encoding("UTF-8");
> >
> >     // I have tried many lines like the 2 below
> >     // (comment out one or the other)
> >     mb_substitute_character((int)0x3013);
> >     mb_substitute_character((int)63); // '?' is ASCII 63
> >
> >     // \xC0\xBC is invalid UTF-8 - an overlong encoding of '<', which should be \x3C
> >     $input = "a bad angle bracket \xC0\xBC here";
> >     $valid = mb_convert_encoding($input, "UTF-8", "UTF-8");
> >
> >     // I always find 2 spaces between 'bracket' and 'here'
> >     echo "valid='$valid'\n";



-- 
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256  http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
#include <std_disclaimer.h>

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




