On Wed, 22 May 2019, Jeremy Finzel wrote:
There's absolutely no need to use anything beyond SQL here, though you could if you want to.
Jeremy, this is a new experience for me, so I didn't think of a SQL solution.
I really wonder how much we are just talking past each other simply because we don't know what your data looks like, so we can't show you how our examples apply to your use case. If you provided a sample scrubbed data file, this whole thread probably would have been much shorter :). Can you do that?
Not necessary; see below. Also, these data come from a regulator and are provided as an Excel spreadsheet. If they were extracted from a database, then that database was very poorly designed, because there's no consistency in how fields/columns are formatted. This requires manual cleaning. Each row in the source file (exported from the spreadsheet as .csv and renamed to .txt for processing in emacs and awk) is a mixture of attributes that belong in either or both of the organization and people tables in my database. An awk script will extract the appropriate fields for each table.
You told Francisco that the data file does not have a unique org name that could be used as a unique organization identifier. However, you seem to have contradicted that by responding favorably to this solution:
The org_name is not the PK; the org_id is. It was assigned by postgres when the original rows were inserted. Now, I could add the org_id to the values to be inserted, since I know the maximum org_id number in that table.
INSERT INTO people (id, name, org_id) VALUES (1, 'Bob', (SELECT org_id FROM org WHERE org_name = 'Main Office'));
Question: do I use this same syntax in a separate statement for each row to be inserted, or can I make it one long INSERT statement by separating the parenthesized value lists with commas, as I do when I insert multiple rows into a table? See the sketch below for what I have in mind. Thanks very much, Rich
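
To make the question concrete, here is a sketch of the multi-row form I mean. The table and column names follow the example above; the second person ('Alice') and the second organization ('Branch Office') are made-up values just for illustration:

    -- One INSERT, several rows; each row looks up its org_id
    -- with a scalar subquery against the org table.
    INSERT INTO people (id, name, org_id) VALUES
      (1, 'Bob',   (SELECT org_id FROM org WHERE org_name = 'Main Office')),
      (2, 'Alice', (SELECT org_id FROM org WHERE org_name = 'Branch Office'));
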