Re: Efficiently parsing a File

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/18/2014 10:57 AM, Tiago Hori wrote:
Hi Everyone,

I fairly new at this, so please bear with me. :)

I am building this web app for a project I am working at where I need to
store and process large amounts of data.

The data comes in comma delimited style. There are a bunch of headers that
I don't need and then the data. These files contain 96 times 96 entries and
I need to parse each one of those. Right now I have something working that
takes about 5 minutes to parse the whole file. Any tips on how to make this
run more efficiently would be greatly appreciated.

Thanks,

Tiago

here is the relevant code:



I cleaned up the code a little. Moved a few things around. Removed a few duplicate things and a part that didn't seem useful. Give this code a try:

if ( $_FILES['data'] ) {
    $dataSet = array();
    $parts = array();
    $c = 0;
    $filename = $_FILES['data']['name'];
    if ( file_exists($filename) ) {
        die ("A file with this name already exists in the database <br />");
    } else {
        move_uploaded_file($_FILES['data']['tmp_name'], $filename);
    }
    $runid = substr($filename, 0, -4);
    echo "Uploaded file '$runid' <br />";
    $fh = fopen($filename, 'r') or
         die("File does not exist or you lack permission to open it");
    echo '<table>';
    while ( !feof($fh) ) {
        $parts = fgetcsv($fh);
        if ( $parts[0] == 'ID' ) {
            $id = sanitizeStrings($parts[0]);
            $assay = sanitizeStrings($parts[1]);
            $allele1 = sanitizeStrings($parts[2]);
            $allele2 = sanitizeStrings($parts[3]);
            $name = sanitizeStrings($parts[4]);
        } else if ($t = preg_match("/S\d\d-\D\d\d/", $parts[0]) ) {
            $id = sanitizeStrings($parts[0]);
            $assay = sanitizeStrings($parts[1]);
            $alleles = sanitizeStrings($parts[9]);
            $allele1 = $allele2 = 'No Call';
            if ( preg_match("/[ATCG]:[ATCG]/", $alleles) ) {
                list($allele1, $allele2) = explode(':', $alleles, 2);
            }
            $name = sanitizeStrings($parts[4]);
            if ($name != 'Blank') {
$dataSet[] = "('{$runid}', '{$name}', '{$id}', '{$assay}', '{$allele1}', '{$allele2}')";
            }
        }
        echo <<<_END
<tr>
  <th>{$name}</th>
  <th>{$id}</th>
  <th>{$assay}</th>
  <th>{$allele1}</th>
  <th>{$allele2}</th>
</tr>
_END;

    }
    if ( $dataSet ) {
queryMysql("INSERT INTO genotyped (runid, fishid, plateid, assayid, allele1, allele2) VALUES ". join(', ', $dataSet));
    }
    fclose($fh);
    echo '</table>';
}



--
Jim Lucas

http://www.cmsws.com/
http://www.cmsws.com/examples/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php





[Index of Archives]     [PHP Home]     [Apache Users]     [PHP on Windows]     [Kernel Newbies]     [PHP Install]     [PHP Classes]     [Pear]     [Postgresql]     [Postgresql PHP]     [PHP on Windows]     [PHP Database Programming]     [PHP SOAP]

  Powered by Linux