RE: a better way to do a data import?


 



What about uploading the entire file into a [semi]temporary table, then doing cross-table comparisons to load your main table with the data?
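Untested, and the names below (a staging table called import_staging, an id key column, the prelim_carriers main table) are just stand-ins for whatever you actually have, but once the raw file is sitting in a staging table the whole compare/merge can be pushed down into Postgres with a couple of set-based statements, roughly:

    -- load the raw file straight into the staging table
    -- (\copy from psql does the same thing without needing superuser rights)
    COPY import_staging FROM '/path/to/datafile.csv' WITH CSV;

    -- overwrite rows that already exist in the main table
    -- (extend the SET list to cover the remaining columns)
    UPDATE prelim_carriers
       SET cars1        = s.cars1,
           car_amount   = s.car_amount,
           total_work   = trim(s.total_work),
           business_zip = trim(s.business_zip)
      FROM import_staging s
     WHERE prelim_carriers.id = s.id;

    -- add the rows that aren't there yet
    INSERT INTO prelim_carriers
    SELECT s.*
      FROM import_staging s
     WHERE NOT EXISTS (SELECT 1 FROM prelim_carriers p WHERE p.id = s.id);

That keeps the 200k rows out of PHP's memory entirely, and the database sees three statements instead of a few hundred thousand round trips.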
 
bastien

> Date: Mon, 21 Jan 2008 13:08:27 -0500
> From: blackwaterdev@xxxxxxxxx
> To: php-general@xxxxxxxxxxxxx
> Subject: Re: a better way to do a data import?
>
> I think that's possible, so I'll give it a shot.
>
> For some reason, even with straight inserts my php script is dying around
> 180,000 rows. Basically, I took out all the compare/update code so now I
> grab the row from the db and if there isn't one, do an insert. I've wiped
> my db so it should do straight inserts, so I'm not sure what's taking up
> the memory.
>
> private function processFile($table, $key){
>     $this->openFileForReading(); //found in GLFile class
>     while (!feof($this->fileHandle)) {
>         $file_data = fgets($this->fileHandle);
>         $this->insert($table, $key, $file_data);
>     }
>     $this->closeFile();
> }
>
> private function insert($table, $key, $data){
>     if (strlen($data)<10) return false;
>
>     $data=$this->db->escape_string($data);
>
>     //this is the data we will use should we need to do an insert
>     $insert_data = str_replace("\"", "'", $data);
>
>     //this is a hack: we need to change the separator of the file
>     //because we need to put the data into an array, and if we simply
>     //use the comma, then it splits address fields
>     $data = str_replace("\",\"", "~", $data);
>     $data = str_replace("\"","",$data); //let's remove the double quotes
>     $this->setDelimiter("~");
>     $dataToArray=$this->stringToArray($data);
>     //set it back for other functions
>     $this->setDelimiter(",");
>
>     //get the id, we trust it is the first column
>     $key_data=$dataToArray[0];
>
>     //does the value exist in the database already?
>     $sql="select * from prelim_$table where $key='$key_data'";
>     $handle=$this->db->select($sql);
>     if ($this->db->row_count($handle)>0){
>         $textData=array();
>         $colsToUpdate="";
>         $dataRow="";
>         $dataRow=$this->db->fetch_row($handle);
>         //now that we have the data, let's merge the row from the
>         //file with the column names
>         $textData=array_combine($this->carrierColumns, $dataToArray);
>         //cast some values that are strings in the text file
>         $textData['cars1']=(int) $textData['cars1'];
>         $textData['car_amount']=(int) $textData['car_amount'];
>         $textData['total_work']=trim($textData['total_work']);
>         $textData['business_zip']=trim($textData['business_zip']);
>         //clean up some old db data
>         $dataRow['rfc_number']=trim($dataRow['rfc_number']);
>         $dataRow['business_zip']=trim($dataRow['business_zip']);
>
>         $colsToUpdate=array_diff($textData,$dataRow);
>
>         //if we have columns to update, do it
>         if (count($colsToUpdate)>0){
>             $colset="";
>             foreach ($colsToUpdate as $column=>$value){
>                 $colset.="$column='$value',";
>             }
>             //strip last comma
>             $colset=substr($colset, 0, -1);
>             $sql="update prelim_$table set $colset where $key='$key_data'";
>             $this->db->write($sql);
>         }
>
>         $dataRow=NULL;
>         $colsToUpdate=NULL;
>         $colset=NULL;
>         $textData=NULL;
>     }
>     else{
>         //insert the row
>         $sql="insert into prelim_$table values (";
>         $sql.=trim($insert_data);
>         $sql.=");";
>         $this->db->write($sql);
>     }
> }
>
> On Jan 21, 2008 12:55 PM, Robert Cummings <robert@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, 2008-01-21 at 12:35 -0500, blackwater dev wrote:
> > > I have a text file that contains 200k rows. These rows are to be
> > > imported into our database. The majority of them will already exist
> > > while a few are new. Here are a few options I've tried:
> > >
> > > I've had php cycle through the file row by row and if the row is there,
> > > delete it and do a straight insert, but that took a while.
> > >
> > > Now I have php get the row from the text file and then do array_combine
> > > with a default array I have in the class so I can have key/value pairs.
> > > I then take that generated array and do array_diff against the data
> > > array I pulled from the db, and I then have the columns that are
> > > different, so I do an update on only those columns for that specific
> > > row. This is slow, and after about 180,000 rows php throws a memory
> > > error. I'm resetting all my vars to NULL at each iteration so am not
> > > sure what's up.
> > >
> > > Anyone have a better way to do this? In MySQL, I could simply do a
> > > replace on each row... but not in postgres.
> >
> > Does Postgres support any method of temporarily disabling keys/indexing?
> > Indexing is what causes the inserts to take a while. MySQL can optimize
> > an import by locking the table and allowing the keys/indexes to be
> > temporarily disabled. You'll see the following lines in recent MySQL
> > database dumps surrounding the inserts:
> >
> > /*!40000 ALTER TABLE `xxx` DISABLE KEYS */;
> > INSERT ...
> > INSERT ...
> > /*!40000 ALTER TABLE `xxx` ENABLE KEYS */;
> >
> > Cheers,
> > Rob.
> > --
> > ...........................................................
> > SwarmBuy.com - http://www.swarmbuy.com
> >
> >     Leveraging the buying power of the masses!
> > ...........................................................
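Two other thoughts on the code above:

The memory creep is most likely the 200k per-row result sets never being released, rather than the arrays that are already being NULLed out. If the db wrapper sits on top of the pgsql extension and hands back a plain result resource, freeing it each time through the loop should keep usage flat -- something like:

    // at the end of insert(), once the row has been handled
    pg_free_result($handle);              // release the per-row result set
    unset($dataToArray, $insert_data);    // and the working copies of the line

(Adjust that to whatever the db class actually exposes.) While you're in there, fgetcsv() will parse the quoted, comma-separated lines for you and would let you drop the ~ substitution hack.

On Rob's question: as far as I know Postgres has nothing equivalent to ALTER TABLE ... DISABLE KEYS, but for a one-off bulk load you can get much the same effect by dropping the secondary indexes, loading, and rebuilding them afterwards (index names here are made up):

    DROP INDEX prelim_carriers_zip_idx;   -- repeat for the other non-key indexes
    -- ... COPY / insert the 200k rows here ...
    CREATE INDEX prelim_carriers_zip_idx ON prelim_carriers (business_zip);
    ANALYZE prelim_carriers;              -- refresh planner stats after the load

Rebuilding an index once over the finished table is much cheaper than maintaining it row by row during the load.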

