On Jan 21, 2008 12:35 PM, blackwater dev <blackwaterdev@xxxxxxxxx> wrote:
> I have a text file that contains 200k rows. These rows are to be imported
> into our database. The majority of them will already exist while a few are
> new. Here are a few options I've tried:
>
> I've had php cycle through the file row by row, and if the row is there,
> delete it and do a straight insert, but that took a while.
>
> Now I have php get the row from the text file and then do array_combine with
> a default array I have in the class so I can have key/value pairs. I then
> take that generated array and do array_diff against the data array I pulled
> from the db, and I then have the columns that are different, so I do an
> update on only those columns for that specific row. This is slow, and after
> about 180,000 rows, php throws a memory error. I'm resetting all my vars to
> NULL at each iteration, so I'm not sure what's up.
>
> Anyone have a better way to do this? In MySQL, I could simply do a replace
> on each row...but not in postgres.
>
> Thanks!

If it is a memory error, perhaps you could just do a select of only the
primary keys in your existing recordset first. Then iterate over each of the
new records and say: if !isset(primary key) then insert. That way the only
things in memory are the current primary keys and a single new record (a
rough sketch of this is at the end of this message). This might not work for
you, though.

If that doesn't help, and your primary key is auto-incrementing, you could on
each import just insert the records whose keys are greater than the previous
max value.

One other possible solution: I have a data set that I have to update each
night from an XML feed. I just rip through the file and insert all the
records with a single timestamp, and once I'm done I delete every record
whose timestamp is older than the current batch's (also sketched below). This
is for a very specific case, though, and might not be right for your needs.

Good luck!
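Here's a rough sketch of the select-the-keys-first idea, assuming PDO with
the pgsql driver and a made-up table "items" whose primary key is the first
field of each tab-delimited line. Adjust the connection string, table, and
column names to your schema:

<?php
// Sketch only: hypothetical connection details and table layout.
$pdo = new PDO('pgsql:host=localhost;dbname=mydb', 'user', 'secret');

// Pull just the primary keys, and key the array by them so the
// isset() test below is a cheap hash lookup.
$existing = array();
foreach ($pdo->query('SELECT id FROM items') as $row) {
    $existing[$row['id']] = true;
}

$insert = $pdo->prepare('INSERT INTO items (id, name, price) VALUES (?, ?, ?)');

$fh = fopen('import.txt', 'r');
while (($fields = fgetcsv($fh, 0, "\t")) !== false) {
    list($id, $name, $price) = $fields;
    if (!isset($existing[$id])) {
        // Only brand-new rows get inserted; existing rows are skipped.
        $insert->execute(array($id, $name, $price));
    }
}
fclose($fh);

Keying the array by id means memory holds one small entry per existing key
plus whatever fgetcsv returns for the current line, nothing more.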
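And the timestamp variant, again with invented table and column names
(items, imported_at). Note it only works if keys can't collide with the
previous batch (e.g. a serial id), since the old rows are still present while
the new ones go in:

<?php
// Sketch of the "insert with a batch timestamp, then sweep" approach.
$pdo = new PDO('pgsql:host=localhost;dbname=mydb', 'user', 'secret');

$batch  = date('Y-m-d H:i:s');
$insert = $pdo->prepare(
    'INSERT INTO items (name, price, imported_at) VALUES (?, ?, ?)'
);

$pdo->beginTransaction();

$fh = fopen('import.txt', 'r');
while (($fields = fgetcsv($fh, 0, "\t")) !== false) {
    list($name, $price) = $fields;
    $insert->execute(array($name, $price, $batch));
}
fclose($fh);

// Anything not stamped with this run's timestamp is stale; drop it.
$delete = $pdo->prepare('DELETE FROM items WHERE imported_at < ?');
$delete->execute(array($batch));

$pdo->commit();

Wrapping the whole thing in one transaction keeps readers from ever seeing a
half-finished batch.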