Re: RegEx to check for non-Latin characters

Hi Behzad,
I would try a different approach ...
EXAMPLE (UTF-8):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Persia</title>
</head>
<body>
<?php
$username = '1aﺠﺟﺝﭻﭽﭼﭺ2b';
$encoding = 'utf-8';
$username = mbStringToArray($username, $encoding);
foreach ($username as $char) {
    // strlen() counts bytes, so a length of 1 means a single-byte character
    if (strlen($char) == 1) echo $char.' is not a multibyte character<br />';
}

// Split a string into an array of characters using the mbstring functions
function mbStringToArray ($string, $encoding) {
    $array = array();
    $strlen = mb_strlen($string, $encoding);
    while ($strlen) {
        $array[] = mb_substr($string, 0, 1, $encoding);
        $string = mb_substr($string, 1, $strlen, $encoding);
        $strlen = mb_strlen($string, $encoding);
    }
    return $array;
}
?>
</body>
</html>
As you can see, I'm using the multibyte string functions [1] to split $username into a character-by-character array. Then I use strlen() [2], which counts bytes, so any non-ASCII UTF-8 character has a length > 1.
Note: This might change with PHP6.
It also does not yet check for Persian characters only. You would have to try something like this ...
EXAMPLE (UTF-8):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Persia</title>
</head>
<body>
<?php
$username = '1aﺠﺟﺝﭻﭽﭼﭺ2b';
$encoding = 'utf-8';
// Whitelist of accepted Persian characters
$chars = 'ﺐﺒﺑﺏﭗﭙﭙپﺖﺘﺗﺕﺚﺜﺛﺙﺞﺠﺟﺝﭻﭽﭼﭺﺢﺤﺣﺡﺦﺨﺧﺥﺪﺪﺩﺩﺬﺬﺫﺫﺮﺮﺭﺭﺰﺰﺯﺯﮋﮋژژﺲﺴﺳﺱﺶﺸﺷﺵﺺﺼﺻﺹﺾﻀﺿﺽﻂﻄﻃﻁﻆﻈﻇﻅﻊﻌﻋﻉﻎﻐﻏﻍﻒﻔﻓﻑﻖﻘﻗﻕﮏﮑﮐکﮓﮕﮔگﻞﻠﻟﻝﻢﻤﻣﻡﻦﻨﻧﻥﻮﻮووﻪﻬﻫﻩﯽﻴﻳﻯ';
$username = mbStringToArray($username, $encoding);
foreach ($username as $char) {
    // strlen() counts bytes, so a length of 1 means a single-byte character
    if (strlen($char) == 1) echo $char.' is not a multibyte character<br />';
    // The character is accepted only if it appears in the whitelist
    if (mb_strpos($chars, $char, 0, $encoding) !== false) echo $char.' is a Persian character<br />';
}

// Split a string into an array of characters using the mbstring functions
function mbStringToArray ($string, $encoding) {
    $array = array();
    $strlen = mb_strlen($string, $encoding);
    while ($strlen) {
        $array[] = mb_substr($string, 0, 1, $encoding);
        $string = mb_substr($string, 1, $strlen, $encoding);
        $strlen = mb_strlen($string, $encoding);
    }
    return $array;
}
?>
</body>
</html>
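If the PCRE build on your server supports UTF-8 and Unicode properties, the same two checks can probably be written more compactly. The following is only a sketch under that assumption, not a tested replacement for the code above. It also assumes PHP 7 or later (for the type declarations and the "\u{...}" escapes). The function names utf8Chars(), isMultibyte() and isArabicScript() are made up for this example. Note that \p{Arabic} matches the whole Arabic script, so it is broader than the hand-picked Persian list in $chars.

```php
<?php
// Hypothetical helpers for illustration only; the names are not from any library.

// Split a UTF-8 string into characters. The 'u' modifier makes PCRE
// treat pattern and subject as UTF-8, so the empty pattern matches
// between code points instead of between bytes.
function utf8Chars(string $string): array {
    $parts = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on invalid UTF-8; fall back to an empty array
    return $parts === false ? [] : $parts;
}

// strlen() counts bytes, so a single UTF-8 character longer than
// one byte is a multibyte character.
function isMultibyte(string $char): bool {
    return strlen($char) > 1;
}

// True if the single character belongs to the Arabic script
// (assumption: this is a superset of the Persian alphabet, not an exact match).
function isArabicScript(string $char): bool {
    return preg_match('/^\p{Arabic}$/u', $char) === 1;
}
```

For instance, utf8Chars("a\u{067E}2") should yield three elements, of which only the middle one (the letter peh) is multibyte and in the Arabic script. If you need Persian letters only, the explicit $chars list stays the stricter test.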
[1] http://in.php.net/manual/en/ref.mbstring.php
[2] http://in.php.net/manual/en/function.strlen.php
