On 4 November 2010 15:11, robert mena <robert.mena@xxxxxxxxx> wrote:
> Hi,
>
> The core of the code is simply
>
> $fp = fopen('file.tab', 'rb');
> while(!feof($fp))
> {
>   $line = fgets($fp);
>   $data = explode("\t", $line);
>   ...
> }
>
> So I try to manipulate the $data[X]. For example $data[0] is supposed to be
> numeric, so I do $n = (int) $data[0]
>
> One other thing: the second column should contain a string. If I check
> the string visually it is correct, but an if ($data[1] == 'stringX') is false
> even though I can see this value in the file (and print those two).
>
> I even did an md5 of both and they are different.
>
> It seems to be an encoding issue. Is it safe to use explode with utf8
> strings?
>
> I even tried this code but no match was found (just to replace the explode)
>
> $str = "abc æååã   efg";
> $results = array();
> preg_match_all("/\t/u", $str, $results);
> var_dump($results[0]);
>
> On Thu, Nov 4, 2010 at 6:33 AM, Richard Quadling <rquadling@xxxxxxxxx>
> wrote:
>>
>> On 3 November 2010 21:42, Alexander Holodny <alexander.holodny@xxxxxxxxx>
>> wrote:
>> > To exclude unexpected behavior in case of wrongly formatted input data,
>> > it would be much better to use such a type-casting method:
>> > intval(ltrim(trim($inStr), '0'))
>> >
>> > 2010/11/3, Nicholas Kell <nick@xxxxxxxxxxxxxxxx>:
>> >>
>> >> On Nov 3, 2010, at 4:22 PM, robert mena wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I have a text file (utf-8 encoded) which contains lines with numbers
>> >>> and text separated by \t. I need to convert the numbers that contain 0
>> >>> (at left) to integers.
>> >>>
>> >>> For some reason one line that contains 00000002 is cast to 0 instead
>> >>> of 2.
>> >>> Below is the output of the cast (int) $field[0], where I get this from
>> >>> exploding each line.
>> >>>
>> >>> 0 ï00000002
>> >>> 4 00000004
>> >>
>> >>
>> >> My first guess is wondering how you are grabbing the strings from the
>> >> file.
>> >> Seems to me like it would just drop the zeros on the left by default.
>> >> Are you including the \t in the string by accident? If so, that may be
>> >> hosing it. Otherwise, have you tried ltrim on it?
>> >>
>> >> Ex:
>> >>
>> >> $_castableString = ltrim($_yourString, '0');
>> >>
>> >> // Now cast
>>
>> <?php
>> // Create test file.
>> $s_TabbedFilename = './test.tab';
>> file_put_contents($s_TabbedFilename, "0\t00000002" . PHP_EOL .
>>     "4\t00000004" . PHP_EOL);
>>
>> // Open test file.
>> $fp_TabbedFile = fopen($s_TabbedFilename, 'rt')
>>     or die("Could not open {$s_TabbedFilename}\n");
>>
>> // Iterate file: fgetcsv() returns False at end of file.
>> while (True) {
>>     if (False !== ($a_Line = fgetcsv($fp_TabbedFile, 0, "\t"))) {
>>         // Fields as read (strings).
>>         var_dump($a_Line);
>>         foreach ($a_Line as $i_Index => $m_Value) {
>>             $a_Line[$i_Index] = intval($m_Value);
>>         }
>>         // Fields after conversion (integers).
>>         var_dump($a_Line);
>>     } else {
>>         break;
>>     }
>> }
>>
>> // Close the file.
>> fclose($fp_TabbedFile);
>>
>> // Delete the file.
>> unlink($s_TabbedFilename);
>>
>>
>> outputs ...
>>
>> array(2) {
>>   [0]=>
>>   string(1) "0"
>>   [1]=>
>>   string(8) "00000002"
>> }
>> array(2) {
>>   [0]=>
>>   int(0)
>>   [1]=>
>>   int(2)
>> }
>> array(2) {
>>   [0]=>
>>   string(1) "4"
>>   [1]=>
>>   string(8) "00000004"
>> }
>> array(2) {
>>   [0]=>
>>   int(4)
>>   [1]=>
>>   int(4)
>> }
>>
>> intval() operates on base 10 by default, so there is no need to worry
>> about leading zeros being treated as base 8/octal.
>>
>> What is your code? Can you reduce it to something as small as the
>> above to see if you can repeat the issue?

Please don't top post.

With regard to UTF-8 data: no, PHP is not Unicode aware, and explode()
splits on bytes. If a multi-byte character contained a 0x09 byte, it
would be broken, though valid UTF-8 never uses a byte below 0x80 inside
a multi-byte sequence.

Can you supply the file you are working on?
Base64-encode it and drop it into a pastebin.

--
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
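
A note on the symptoms reported in the thread. The stray "ï" in front of 00000002 on the first line looks like the start of a UTF-8 byte order mark (bytes EF BB BF), which would make the (int) cast return 0 for that line only. Separately, fgets() keeps the line terminator, so the last field of each line ends in "\n", which is enough to make an == comparison and an md5 differ. Both are guesses made without seeing the file. A minimal sketch under those assumptions follows; stripBom() and parseLine() are hypothetical helper names, not code from the thread, and the sample line is invented input.

```php
<?php
// Hypothetical helpers for illustration only; not code from the thread.

// Remove a UTF-8 byte order mark (EF BB BF) if the string starts with one.
function stripBom($s)
{
    if (substr($s, 0, 3) === "\xEF\xBB\xBF") {
        return substr($s, 3);
    }
    return $s;
}

// Split one tab-separated line into fields, dropping the line terminator
// that fgets() leaves at the end of the line.
function parseLine($line)
{
    return explode("\t", rtrim($line, "\r\n"));
}

// Assumed input: the first line of a file that was saved with a BOM.
$firstLine = "\xEF\xBB\xBF00000002\tstringX\n";

$raw   = explode("\t", $firstLine);
$clean = parseLine(stripBom($firstLine));

var_dump((int) $raw[0]);            // int(0): the cast hits the BOM bytes first
var_dump((int) $clean[0]);          // int(2)
var_dump($raw[1] === 'stringX');    // bool(false): "\n" is still attached
var_dump($clean[1] === 'stringX');  // bool(true)

// Splitting on "\t" does not cut a multi-byte UTF-8 character in half:
// "\xC3\xA9" is a two-byte character and neither byte is 0x09.
var_dump(count(explode("\t", "caf\xC3\xA9\tefg")));  // int(2)
```

In a read loop, stripBom() only needs to be applied to the first line read from the file; parseLine() can be applied to every line.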