Re: Efficiently parsing a File

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jim, 

One quick question:

When you use this call to the mysql database ("INSERT INTO genotyped (runid, fishid, plateid, assayid, allele1, allele2) VALUES ". join(', ', $dataSet));

There are more than 6 items in the $dataSet array as the while loop parsed the whole file, correct? I just want to make sure I understand correctly. Does that INSERT command knows to start another row in the table after it inserted the first 6 values?

Thanks!

Tiago


On Mar 18, 2014, at 3:38 PM, Jim Lucas <lists@xxxxxxxxx> wrote:

> On 03/18/2014 10:57 AM, Tiago Hori wrote:
>> Hi Everyone,
>> 
>> I fairly new at this, so please bear with me. :)
>> 
>> I am building this web app for a project I am working at where I need to
>> store and process large amounts of data.
>> 
>> The data comes in comma delimited style. There are a bunch of headers that
>> I don't need and then the data. These files contain 96 times 96 entries and
>> I need to parse each one of those. Right now I have something working that
>> takes about 5 minutes to parse the whole file. Any tips on how to make this
>> run more efficiently would be greatly appreciated.
>> 
>> Thanks,
>> 
>> Tiago
>> 
>> here is the relevant code:
>> 
>> 
> 
> I cleaned up the code a little.  Moved a few things around.  Removed a few duplicate things and a part that didn't seem useful. Give this code a try:
> 
> if ( $_FILES['data'] ) {
>    $dataSet = array();
>    $parts = array();
>    $c = 0;
>    $filename = $_FILES['data']['name'];
>    if ( file_exists($filename) ) {
>        die ("A file with this name already exists in the database <br />");
>    } else {
>        move_uploaded_file($_FILES['data']['tmp_name'], $filename);
>    }
>    $runid = substr($filename, 0, -4);
>    echo "Uploaded file '$runid' <br />";
>    $fh = fopen($filename, 'r') or
>         die("File does not exist or you lack permission to open it");
>    echo '<table>';
>    while ( !feof($fh) ) {
>        $parts = fgetcsv($fh);
>        if ( $parts[0] == 'ID' ) {
>            $id = sanitizeStrings($parts[0]);
>            $assay = sanitizeStrings($parts[1]);
>            $allele1 = sanitizeStrings($parts[2]);
>            $allele2 = sanitizeStrings($parts[3]);
>            $name = sanitizeStrings($parts[4]);
>        } else if ($t = preg_match("/S\d\d-\D\d\d/", $parts[0]) ) {
>            $id = sanitizeStrings($parts[0]);
>            $assay = sanitizeStrings($parts[1]);
>            $alleles = sanitizeStrings($parts[9]);
>            $allele1 = $allele2 = 'No Call';
>            if ( preg_match("/[ATCG]:[ATCG]/", $alleles) ) {
>                list($allele1, $allele2) = explode(':', $alleles, 2);
>            }
>            $name = sanitizeStrings($parts[4]);
>            if ($name != 'Blank') {
>                $dataSet[] = "('{$runid}', '{$name}', '{$id}', '{$assay}', '{$allele1}', '{$allele2}')";
>            }
>        }
>        echo <<<_END
> <tr>
>  <th>{$name}</th>
>  <th>{$id}</th>
>  <th>{$assay}</th>
>  <th>{$allele1}</th>
>  <th>{$allele2}</th>
> </tr>
> _END;
> 
>    }
>    if ( $dataSet ) {
>        queryMysql("INSERT INTO genotyped (runid, fishid, plateid, assayid, allele1, allele2) VALUES ". join(', ', $dataSet));
>    }
>    fclose($fh);
>    echo '</table>';
> }
> 
> 
> 
> -- 
> Jim Lucas
> 
> http://www.cmsws.com/
> http://www.cmsws.com/examples/


[Index of Archives]     [PHP Home]     [Apache Users]     [PHP on Windows]     [Kernel Newbies]     [PHP Install]     [PHP Classes]     [Pear]     [Postgresql]     [Postgresql PHP]     [PHP on Windows]     [PHP Database Programming]     [PHP SOAP]

  Powered by Linux