On Jan 21, 2008 12:35 PM, blackwater dev <blackwaterdev@xxxxxxxxx> wrote:
> I have a text file that contains 200k rows. These rows are to be imported
> into our database. The majority of them will already exist while a few are
> new. Here are a few options I've tried:
>
> I've had php cycle through the file row by row, and if the row is there,
> delete it and do a straight insert, but that took a while.
>
> Now I have php get the row from the text file and then do array_combine with
> a default array I have in the class so I can have key/value pairs. I then
> take that generated array and do array_diff against the data array I pulled
> from the db, and I then have the columns that are different, so I do an
> update on only those columns for that specific row. This is slow, and after
> about 180,000 rows, php throws a memory error. I'm resetting all my vars to
> NULL at each iteration, so I'm not sure what's up.
>
> Anyone have a better way to do this? In MySQL, I could simply do a replace
> on each row...but not in postgres.
>
> Thanks!

If it is a memory error, perhaps you could just do a select of only the
primary keys in your existing recordset first. Then iterate over each of the
new records and say: if !isset(primary key) then insert. That way the only
things in memory are the current primary keys and a single new record (a
rough sketch of this is at the end of this message). This might not work for
you, though.

If that doesn't help, and your primary key is auto-incrementing, you could on
each import just insert the records whose keys are greater than the previous
max value.

One other possible solution: I have a data set that I have to update each
night from an XML feed. I just rip through the file and insert all the
records with a single timestamp, and once I'm done I delete every record
whose timestamp is older than the current batch's (also sketched below). This
is for a very specific case, though, and might not be right for your needs.

Good luck!
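Here's a rough sketch of the select-the-keys-first idea, assuming PDO with
the pgsql driver and a made-up table "items" whose primary key is the first
field of each tab-delimited line. Adjust the connection string, table, and
column names to your schema:

<?php
// Sketch only: hypothetical connection details and table layout.
$pdo = new PDO('pgsql:host=localhost;dbname=mydb', 'user', 'secret');

// Pull just the primary keys, and key the array by them so the
// isset() test below is a cheap hash lookup.
$existing = array();
foreach ($pdo->query('SELECT id FROM items') as $row) {
    $existing[$row['id']] = true;
}

$insert = $pdo->prepare('INSERT INTO items (id, name, price) VALUES (?, ?, ?)');

$fh = fopen('import.txt', 'r');
while (($fields = fgetcsv($fh, 0, "\t")) !== false) {
    list($id, $name, $price) = $fields;
    if (!isset($existing[$id])) {
        // Only brand-new rows get inserted; existing rows are skipped.
        $insert->execute(array($id, $name, $price));
    }
}
fclose($fh);

Keying the array by id means memory holds one small entry per existing key
plus whatever fgetcsv returns for the current line, nothing more.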
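And the timestamp variant, again with invented table and column names
(items, imported_at). Note it only works if keys can't collide with the
previous batch (e.g. a serial id), since the old rows are still present while
the new ones go in:

<?php
// Sketch of the "insert with a batch timestamp, then sweep" approach.
$pdo = new PDO('pgsql:host=localhost;dbname=mydb', 'user', 'secret');

$batch  = date('Y-m-d H:i:s');
$insert = $pdo->prepare(
    'INSERT INTO items (name, price, imported_at) VALUES (?, ?, ?)'
);

$pdo->beginTransaction();

$fh = fopen('import.txt', 'r');
while (($fields = fgetcsv($fh, 0, "\t")) !== false) {
    list($name, $price) = $fields;
    $insert->execute(array($name, $price, $batch));
}
fclose($fh);

// Anything not stamped with this run's timestamp is stale; drop it.
$delete = $pdo->prepare('DELETE FROM items WHERE imported_at < ?');
$delete->execute(array($batch));

$pdo->commit();

Wrapping the whole thing in one transaction keeps readers from ever seeing a
half-finished batch.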