Thanks for the reply, Steve. These suggestions are new to me, so I'd like to rephrase them back to you to make sure I understand the details.

On Dec 19, 2012, at 10:48 AM, Steve Crawford <scrawford@xxxxxxxxxxxxxxxxxxxx> wrote:

> I suppose you could use a trigger to check each record before inserting, but that is likely to be inefficient for bulk loads. A quick bash loop is probably your best bet. Something along the lines of:

I get this, except for the header bit. Here is my interpretation of your code, including my specific connection parameters:

#!/bin/sh
for inputfile in '/Volumes/disk7/b4warmed3/export/60min2/*.txt'
do
    cat "$inputfile" | psql -p 54321 -h localhost -c "\copy rawinput FROM stdin WITH CSV HEADER DELIMITER AS ',' NULL AS 'NA'"
done

I added single quotes around the path to the input files. Correct, right?
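Actually, after poking at the shell a bit, I suspect the single quotes stop the shell from expanding *.txt, so the loop would run once on the literal string rather than once per file. Here is a sketch of what I think the working version looks like, with the glob unquoted and the whole \copy command in double quotes so the inner single quotes survive (same port, host, table, and directory as above; please correct me if I have this wrong):

#!/bin/sh
# Load every 60-minute export file into rawinput, one \copy per file.
for inputfile in /Volumes/disk7/b4warmed3/export/60min2/*.txt
do
    # HEADER makes COPY skip the first line of each file,
    # and NULL AS 'NA' maps the string NA to SQL NULL.
    psql -p 54321 -h localhost \
        -c "\copy rawinput FROM stdin WITH CSV HEADER DELIMITER AS ',' NULL AS 'NA'" \
        < "$inputfile"
done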
This bit needs to be a separate step, right? (rowid is the primary key.)

SELECT DISTINCT ON (rowid) * FROM rawinput;

From here, do I need another COPY FROM or some kind of INSERT statement?
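If I'm reading you right, that SELECT by itself just displays the de-duplicated rows, and getting them into a final table would be an INSERT ... SELECT rather than another COPY. A rough sketch of what I have in mind (cleaninput is just a placeholder name for the final table, which I assume has the same columns as rawinput):

INSERT INTO cleaninput
SELECT DISTINCT ON (rowid) *
FROM rawinput
ORDER BY rowid;

My understanding is that DISTINCT ON keeps the first row it sees for each rowid, in ORDER BY order, so the ORDER BY has to lead with rowid. Is that right?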
I understand that I need to create the rawinput table first, but I am unfamiliar with the UNLOGGED option. I assume it makes things faster… Does it go something like:

CREATE UNLOGGED TABLE rawinput ( … );

Do I need to declare all the columns (including types) in rawinput as well? If so, then I assume that I do not want rowid to have a primary key… or else I would be back where I started.
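Putting those pieces together, here is the rough shape I have in mind for the staging table (rowid is from the thread; the other columns are placeholders for my actual ones):

CREATE UNLOGGED TABLE rawinput (
    rowid     text,        -- deliberately NOT a primary key, so duplicate rows load without errors
    obs_time  timestamp,   -- placeholder column
    temp      numeric      -- placeholder column
    -- ...and so on, matching the CSV layout
);

My understanding is that UNLOGGED skips the write-ahead log, which speeds up bulk loading, with the trade-off that the table is emptied after a crash and is not replicated. Is that about right?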
Thanks again,

Kirk