On Wed, 22 May 2019, Jeremy Finzel wrote:
There's absolutely no need to use anything beyond SQL here, though you could if you want to.
Jeremy, this is a new experience for me, so I didn't think of a SQL solution.
I really wonder how much we are just talking past each other simply because we don't know what your data looks like, so we can't show you how our examples apply to your use case. If you provided a sample scrubbed data file, this whole thread probably would have been much shorter :). Can you do that?
Not necessary; see below. Also, these data come from a regulator and are provided as an Excel spreadsheet. If they were extracted from a database, then that database was very poorly designed, because there's no consistency in how fields/columns are formatted. This requires manual cleaning. Each row in the source file (exported from the spreadsheet as .csv and renamed to .txt for processing in emacs and awk) is a mixture of attributes that belong in either or both of the organization and people tables in my database. An awk script will extract the appropriate fields for each table.
You told Francisco that the data file does not have a unique org name that could be used as a unique organization identifier. However, you seem to have contradicted that by responding favorably to this solution:
The org_name is not the PK; the org_id is. It was assigned by postgres when the original rows were inserted. Now, I could add the org_id to the values to be inserted, since I know the maximum org_id number in that table.
INSERT INTO people (id, name, org_id) VALUES (1, 'Bob', (SELECT org_id FROM org WHERE org_name = 'Main Office'));
Question: do I use this same syntax in a separate statement for each row to be inserted, or can I make it one long INSERT statement by separating the parenthesized value lists with commas, as I do when I insert multiple rows into a table? See the sketch below for what I have in mind. Thanks very much, Rich
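
To make the question concrete, here is a sketch of the multi-row form I mean. The table and column names follow the example above; the second person ('Alice') and the second organization ('Branch Office') are made-up values just for illustration:

    -- One INSERT, several rows; each row looks up its org_id
    -- with a scalar subquery against the org table.
    INSERT INTO people (id, name, org_id) VALUES
      (1, 'Bob',   (SELECT org_id FROM org WHERE org_name = 'Main Office')),
      (2, 'Alice', (SELECT org_id FROM org WHERE org_name = 'Branch Office'));
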