On Mon, Jul 6, 2009 at 4:54 AM, SleePy <sleepingkiller@xxxxxxxxx> wrote:
> I seem to be having a minor issue with preg_replace not working as expected
> when using UTF-8 strings. So far I have found out that \w doesn't seem to be
> detecting UTF-8 strings.
>
> This is my test php file:
> <?php
> $data = 'ooooooooooooooooooooooo';
> echo 'Data before: ', $data, '<br />';
>
> $data = preg_replace('~([\w\.]{6})~u', '$1 < >', $data);
> echo 'Data After: ', $data;
>
> // UTF-8 Test
> $data = 'ффффффффффффффффффффффф';
> echo '<hr />Data before: ', $data, '<br />';
>
> $data = preg_replace('~([\w\.]{6})~u', '$1 < >', $data);
> echo 'Data After: ', $data;
>
> ?>
>
> I would expect it to be:
> Data before: ooooooooooooooooooooooo
> Data After: oooooo < >oooooo < >oooooo < >ooooo
> ---
> Data before: ффффффффффффффффффффффф
> Data After: фффффф <>фффффф <>фффффф<> ффффф
>
> But what I get is:
> Data before: ooooooooooooooooooooooo
> Data After: oooooo < >oooooo < >oooooo < >ooooo
> ---
> Data before: ффффффффффффффффффффффф
> Data After: ффффффффффффффффффффффф
>
> Did I go about this the wrong way or is this a php bug itself?
> I tested this in php 5.3, 5.2.9 and 6.0 (snapshot from a couple weeks ago)
> and received the same results.

Did you try mb_ereg_replace?
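
For reference, here is a minimal sketch of what the mb_ereg_replace route could look like for the test file quoted above. It assumes the mbstring extension is loaded with its regex support and that the script itself is saved as UTF-8. The helper name mark_every_six() is made up for illustration and is not from the original post. I have not run it against the exact PHP versions mentioned, so treat it as a starting point rather than a confirmed fix.

```php
<?php
// Sketch only: assumes the mbstring extension (with mbregex) is available.
// mark_every_six() is a hypothetical helper name used for illustration.
function mark_every_six($data)
{
    // Tell the mb_ereg_* functions to treat pattern and subject as UTF-8.
    mb_regex_encoding('UTF-8');

    // mb_ereg_replace takes a bare pattern (no ~...~ delimiters, no 'u'
    // modifier), and backreferences in the replacement are written \1
    // instead of $1.
    return mb_ereg_replace('([\w\.]{6})', '\1 < >', $data);
}

// Same two inputs as the quoted test file: 23 ASCII and 23 Cyrillic letters.
echo mark_every_six(str_repeat('o', 23)), "\n";
echo mark_every_six(str_repeat('ф', 23)), "\n";
```

The differences from the preg_replace call are the missing delimiters and modifier, the \1 backreference syntax, and the explicit mb_regex_encoding() call. With the encoding set to UTF-8, \w in the mbstring regex engine should cover non-ASCII letters as well, which is the behaviour the original poster was expecting.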