Re: Help needed with mb_convert_encoding()

Hi Alain,

I should advertise patchwork/utf8 <https://github.com/nicolas-grekas/Patchwork-UTF8> here :)

> I am trying to use this to validate input that is supposed to be UTF-8 and
> to replace any bad characters with something - '?' would do.

I'd personally go another way: ill-formed UTF-8 sequences do not exist
in bug-free applications. What is much more common is an application that
sends something other than UTF-8. If you don't know what your input charset
encoding is, then HTML5 tells us that CP1252 is a good default fallback.

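Thus, I'd recommend using this snippet (preg_match is the quickest way to
check for UTF-8 in PHP: with the /u modifier it fails when the subject is
not valid UTF-8, so even an empty pattern is enough):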

$input = "...";

if (!preg_match('//u', $input)) {
    $input = iconv('CP1252', 'UTF-8', $input);
}

// here, $input is a UTF-8 string (or false, if iconv() could not convert it)
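
If you want something reusable, the same check can be wrapped in a small
function. This is only a minimal sketch: the name to_utf8() is made up for
illustration, and it assumes that anything that is not valid UTF-8 is CP1252.
Depending on the iconv implementation, iconv() can return false for bytes
that CP1252 does not define, so the sketch falls back to replacing every
non-ASCII byte with '?', as asked in the original question.

```php
<?php
// Hypothetical helper, not part of PHP or of any library.
function to_utf8(string $input): string
{
    // With the /u modifier, preg_match() returns false on malformed UTF-8,
    // so an empty pattern matching (result 1) means the string is valid.
    if (preg_match('//u', $input) === 1) {
        return $input;
    }

    // Assumption: input that is not valid UTF-8 is CP1252.
    // @ silences the notice iconv() raises on an unconvertible byte.
    $converted = @iconv('CP1252', 'UTF-8', $input);

    if ($converted === false) {
        // Last resort: byte-wise replace of everything outside ASCII.
        return preg_replace('/[^\x00-\x7F]/', '?', $input);
    }

    return $converted;
}
```

Valid UTF-8 passes through untouched, and a lone CP1252 byte such as "\xE9"
comes back as the two-byte UTF-8 sequence "\xC3\xA9".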

Regards,
Nicolas
