Re: Bulk inserts into two (related) tables

Rich Shepard <rshepard@xxxxxxxxxxxxxxx> · Wed, 22 May 2019 12:24:44 -0700 (PDT)

On Wed, 22 May 2019, Adrian Klaver wrote:

> A sample of the data you are cleaning up.

Adrian, et al.:

I have it working properly now. Both org_id and person_id numbers are
prepended to each row in the appropriate table and they are unique because
each series begins one greater than the max(*_id) in each table.
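
For anyone following along, the numbering step can be sketched roughly like this. It is only an illustration under assumptions: the start value of 101 stands in for whatever max(org_id) + 1 returns from the table, and the file names and the two sample rows are made up.

```sh
# Hypothetical start value; in practice it would be max(org_id) + 1
# taken from the existing table.
start=101

# Two made-up rows standing in for the cleaned organization data.
printf '%s\n' "Acme Corp,Portland" "Baker LLC,Salem" > orgs-sample.txt

# Prepend an increasing id to every row, beginning at $start.
awk -v next_id="$start" 'BEGIN { FS = OFS = "," } { print next_id++, $0 }' \
    orgs-sample.txt > orgs-numbered.txt
```

With that input, orgs-numbered.txt holds 101,Acme Corp,Portland and 102,Baker LLC,Salem, so the new ids cannot collide with ids already in the table.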

> I think what people are trying to wrap their heads around is how 800 lines
> in the file are being split into two subsets: the organization data and the
> people data. In particular, how is that being done to preserve the
> relationship between organizations and people? This is before it ever gets
> to the database.

After cleaning there are 655 lines rather than 800, but the short answer is
that splitting the data preserves the relationship between each organization
and its people:

#!/usr/bin/gawk -f

# Read input file, write fields to both organizations and people
# input files.

BEGIN { FS=OFS="," }
# for organizations table input:
{ print $1, $2, $4, "," $8, $9, $10, "'US'," $11, ",,," "'Opportunity','');" > "z-orgs.sql" }
# for people table input:
{ print $6, $5, "," $1, $3, $4, $5, $7, $8, $9, $10, "'US'," $11, "," $12, "," $13, "'true','');" > "z-people.sql" }

You can see that the org_id field ($1) is used in both files, but in
different columns in the two tables.
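
A cut-down illustration of the same idea, assuming a made-up four-column layout (org_id, organization name, last name, first name) and made-up file names rather than my real data:

```sh
# Hypothetical combined input: org_id, org_name, last_name, first_name
printf '%s\n' "101,Acme Corp,Smith,Ann" "102,Baker LLC,Lee,Cy" > combined-sample.txt

awk 'BEGIN { FS = OFS = "," }
     # organization columns, with org_id first
     { print $1, $2 > "orgs-sample.out" }
     # person columns, with the same org_id carried along as the last column
     { print $3, $4, $1 > "people-sample.out" }
' combined-sample.txt
```

orgs-sample.out then contains 101,Acme Corp and 102,Baker LLC, while people-sample.out contains Smith,Ann,101 and Lee,Cy,102. Because both rows come from the same input line, the org_id written to the people file always matches the one written to the organizations file.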

Regards,

Rich