Usage of strlen(utf8_decode()) and "/u" regex modifier

Hello,

  As shown below, "strlen(utf8_decode())" and the "/u" regex
  modifier do not behave the way I expected.

  1) What is my misunderstanding?  

      <?php
      
          $the_string = '&#1052;&#1072;&#1088;&#1080;&#1085;&#1072; &#1054;&#1088;&#1083;&#1086;&#1074;&#1072;';
          echo "<p>author (85 bytes):$the_string,"
              . strlen( $the_string ) . ','
              . strlen( utf8_decode( $the_string ) ) . ','
              . strlen( utf8_decode( utf8_encode( $the_string ) ) ) . ','
              . "</p>";
          // all the numbers echoed are 85; I expected at least one to be 13

          $max_length = 20;
          // double quotes, so that $max_length is interpolated into the pattern
          $is_short = preg_match( "/^.{1,$max_length}\$/u", utf8_encode( $the_string ) );
          // expect the above to return 1

          $max_length = 10;
          $is_short = preg_match( "/^.{1,$max_length}\$/u", utf8_encode( $the_string ) );
          // expect the above to return 0
      
      ?>
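
  One possible reading of these results, offered as a sketch rather than
  a diagnosis: if $the_string really holds the literal numeric character
  references shown (12 references of 7 ASCII bytes each plus one space,
  hence 85 bytes), it is plain ASCII, and utf8_decode()/utf8_encode(),
  which only convert between ISO-8859-1 and UTF-8, leave it unchanged.
  Under that assumption, decoding the references first gives the expected
  counts. The variable names $decoded, $fits_20 and $fits_10 are
  illustrative, and mb_strlen() assumes the mbstring extension is loaded.

```php
<?php
// Assumption: the string holds literal HTML numeric character references.
$the_string = '&#1052;&#1072;&#1088;&#1080;&#1085;&#1072; &#1054;&#1088;&#1083;&#1086;&#1074;&#1072;';

// Turn each reference into the UTF-8 bytes of the character it names.
$decoded = html_entity_decode($the_string, ENT_QUOTES | ENT_HTML5, 'UTF-8');

$byte_count = strlen($decoded);             // counts bytes
$char_count = mb_strlen($decoded, 'UTF-8'); // counts code points

echo strlen($the_string), ',', $byte_count, ',', $char_count, "\n"; // 85,25,13

// With /u the dot matches one code point, so the length limit now applies
// to characters rather than bytes.
$fits_20 = preg_match('/^.{1,' . 20 . '}$/u', $decoded); // 1
$fits_10 = preg_match('/^.{1,' . 10 . '}$/u', $decoded); // 0
```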

  More generally, given a string $the_string:

  2) How can I determine which encoding is being used?

  3) How can I determine the number of visible characters?

  4) If it has more than N visible characters, how can I
     truncate it after N visible characters?
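
  A minimal sketch of how questions 2) to 4) might be approached,
  assuming the mbstring extension is loaded and the string is meant to
  be UTF-8. The helper names looks_like_utf8(), char_count() and
  truncate_chars() are hypothetical. Two caveats: a byte string carries
  no encoding label, so the encoding can only be validated against a
  guess, not determined with certainty; and code points only approximate
  "visible characters", since combining marks would be counted
  separately.

```php
<?php
// Hypothetical helpers; they assume the mbstring extension is available.

// 2) Checks whether the bytes form valid UTF-8. This validates a guess;
//    it cannot prove which encoding the author intended.
function looks_like_utf8(string $s): bool
{
    return mb_check_encoding($s, 'UTF-8');
}

// 3) Counts code points, which approximates the number of visible characters.
function char_count(string $s): int
{
    return mb_strlen($s, 'UTF-8');
}

// 4) Keeps at most $n code points, never cutting inside a multi-byte sequence.
function truncate_chars(string $s, int $n): string
{
    return mb_substr($s, 0, $n, 'UTF-8');
}

// Sample UTF-8 input: two Cyrillic words of six letters each, one space.
$name = "\u{041C}\u{0430}\u{0440}\u{0438}\u{043D}\u{0430} "
      . "\u{041E}\u{0440}\u{043B}\u{043E}\u{0432}\u{0430}";

$valid = looks_like_utf8($name);    // true
$count = char_count($name);         // 13
$first = truncate_chars($name, 6);  // the first word only
```

  Truncating with mb_substr() rather than substr() matters here because
  each Cyrillic letter occupies two bytes in UTF-8, so a byte-based cut
  could split a character in half.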

  Thanks!


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php