Re: preg_replace with UTF-8

On Mon, Jul 6, 2009 at 4:54 AM, SleePy <sleepingkiller@xxxxxxxxx> wrote:

> I seem to be having a minor issue with preg_replace not working as expected
> when using UTF-8 strings. So far I have found that \w doesn't seem to
> match non-ASCII UTF-8 characters.
>
> This is my test php file:
> <?php
> $data = 'ooooooooooooooooooooooo';
> echo 'Data before: ', $data, '<br />';
>
> $data = preg_replace('~([\w\.]{6})~u', '$1 < >', $data);
> echo 'Data After: ', $data;
>
> // UTF-8 Test
> $data = 'ффффффффффффффффффффффф';
> echo '<hr />Data before: ', $data, '<br />';
>
> $data = preg_replace('~([\w\.]{6})~u', '$1 < >', $data);
> echo 'Data After: ', $data;
>
> ?>
>
>
> I would expect it to be:
> Data before: ooooooooooooooooooooooo
> Data After: oooooo < >oooooo < >oooooo < >ooooo
> ---
> Data before: ффффффффффффффффффффффф
> Data After: фффффф < >фффффф < >фффффф < >ффффф
>
> But what I get is:
> Data before: ooooooooooooooooooooooo
> Data After: oooooo < >oooooo < >oooooo < >ooooo
> ---
> Data before: ффффффффффффффффффффффф
> Data After: ффффффффффффффффффффффф
>
> Did I go about this the wrong way, or is this a bug in PHP itself?
> I tested this in PHP 5.3, 5.2.9 and 6.0 (a snapshot from a couple of weeks
> ago) and got the same results.


Have you tried mb_ereg_replace?
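
Something along these lines might be worth a try. It is only a rough sketch: the function names are made up for illustration, the first variant assumes the mbstring extension (with its regex support) is installed, and the second swaps in a different technique, spelling out Unicode letter and number classes instead of relying on \w, which assumes PCRE was built with Unicode property support.

```php
<?php
// Hypothetical helper names; neither appears in the original test file.

// Variant 1: mbstring's regex functions.
// Assumes the mbstring extension with regex support is available.
function split_with_mb_ereg($data)
{
    mb_regex_encoding('UTF-8');
    // mb_ereg_replace takes the pattern without delimiters and
    // uses \1 for back-references in the replacement.
    return mb_ereg_replace('([\w.]{6})', '\1 < >', $data);
}

// Variant 2: stay with preg_replace, but name the character classes
// explicitly: \p{L} = any letter, \p{N} = any number, plus "_" and ".".
// Assumes PCRE was compiled with Unicode property support.
function split_with_unicode_props($data)
{
    return preg_replace('~([\p{L}\p{N}_.]{6})~u', '$1 < >', $data);
}

$data = 'ффффффффффффффффффффффф';
echo 'Data before: ', $data, '<br />';
echo 'Data After: ', split_with_unicode_props($data);
```

If the second variant behaves differently from your original pattern on the same PHP build, that would suggest \w is being treated as ASCII-only there, rather than the u modifier failing altogether.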


