On Wed, 22 May 2019, Francisco Olarte wrote:
You are not reading what we write to you. Note YOU AND ONLY YOU are the one speaking of PK. We are speaking of "unique identifier" ( that would be, IIRC, "candidate keys", you can pick any as your PK, or even introduce a new synthetic one with a sequence, or a femtosecond-exact timestamp or whatever ).
Francisco,

Let me clarify. The organizations table has org_id (an integer) as PK. The people table has person_id (an integer) as PK and org_id as the reference to organizations.org_id. Does this help?
When you are fluent in SQL you do not try to play with files, you import every column of your data into temporary tables, clean them up, and join ( if needed ) them until you have a select that gives you what you want and then insert that. Normally you insert several SELECTs into temporary tables ( especially when you only have thousands of records ) so you can do the clean up in steps.
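The staging-table workflow described above could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the sample rows, the staging table, and the org_name and person_name columns are hypothetical. Only organizations, people, org_id, and person_id come from the thread.

```python
import sqlite3

# Hypothetical raw rows (organization name, person name); not data from the thread.
RAW_ROWS = [
    ("Acme Corp ", "Ann Lee"),
    ("acme corp", "Bob Ray"),
    ("Birch LLC", "Cy Dunn"),
]


def load(conn, rows):
    """Stage raw rows, clean them in SQL, then fill both tables with INSERT ... SELECT."""
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE organizations (
            org_id   INTEGER PRIMARY KEY,   -- assigned automatically on insert
            org_name TEXT UNIQUE
        );
        CREATE TABLE people (
            person_id   INTEGER PRIMARY KEY,
            person_name TEXT,
            org_id      INTEGER REFERENCES organizations(org_id)
        );
        CREATE TEMP TABLE staging (org_name TEXT, person_name TEXT);
    """)
    # Step 1: import every column as-is into the temporary table.
    cur.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    # Step 2: clean up inside the database (here: trim blanks, fold case).
    cur.execute(
        "UPDATE staging SET org_name = lower(trim(org_name)), "
        "person_name = trim(person_name)"
    )
    # Step 3: one row per distinct organization; the database assigns org_id.
    cur.execute(
        "INSERT INTO organizations (org_name) "
        "SELECT DISTINCT org_name FROM staging ORDER BY org_name"
    )
    # Step 4: join staging back to organizations to pick up each person's org_id.
    cur.execute(
        "INSERT INTO people (person_name, org_id) "
        "SELECT s.person_name, o.org_id "
        "FROM staging s JOIN organizations o ON o.org_name = s.org_name"
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, RAW_ROWS)
    for row in conn.execute("SELECT person_id, person_name, org_id FROM people"):
        print(row)
```

In this sketch the keys are never typed by hand: the organization name acts as the unique identifier used for the join, and the integer PKs fall out of the inserts.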
Most of my time is spent writing using LaTeX/LyX. Depending on the project's needs I'll also use SQL, R, GRASS, and other tools. I'm a generalist, like your PCP, not a specialist. But I also rely on emacs, grep, sed, and awk for data munging and am more fluent with these tools than I am with SQL or Python. For me, the quickest and simplest approach is to add the PKs to each table, and the org_id into the people table, when I separate the cleaned text file into the columns for each table.

Regards,

Rich
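For comparison, the file-side approach of adding the keys while splitting the cleaned text file might be sketched as follows. This is not the awk workflow from the message; it is a Python stand-in for the same idea, and the pipe-delimited layout, the sample data, and the split_with_keys helper are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical cleaned input: one "organization|person" pair per line.
CLEANED = "Acme Corp|Ann Lee\nAcme Corp|Bob Ray\nBirch LLC|Cy Dunn\n"


def split_with_keys(text, delimiter="|"):
    """Split cleaned rows into two row lists, assigning integer keys on the way."""
    org_ids = {}          # organization name -> org_id, in order of first appearance
    organizations = []    # rows of (org_id, org_name)
    people = []           # rows of (person_id, person_name, org_id)
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for person_id, (org_name, person_name) in enumerate(reader, start=1):
        if org_name not in org_ids:
            # First time this organization is seen: give it the next org_id.
            org_ids[org_name] = len(org_ids) + 1
            organizations.append((org_ids[org_name], org_name))
        people.append((person_id, person_name, org_ids[org_name]))
    return organizations, people


if __name__ == "__main__":
    orgs, persons = split_with_keys(CLEANED)
    print(orgs)
    print(persons)
```

Note that this only works if the organization name is spelled identically on every line, which is the "unique identifier" point raised earlier in the thread.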