Re: PostgreSQL: Copy from File missing data error

Ron <ronljohnsonjr@xxxxxxxxx> · Tue, 4 Sep 2018 13:26:00 -0500



    You might want to try pg_bulkload, and have it kick out malformed
    rows.

    
    It's packaged for RHEL6 and above, plus various other distros.

    
    On 09/04/2018 01:13 PM, Holly Gibons
      wrote:

    
          I'm using PostgreSQL 9.0 via pgAdmin III
          

            I'm trying to build a PostgreSQL/PostGIS database using Entire
              country files dataset but I'm getting missing data
            error 

            
            I'm wondering if the copy command is affected by diacritics
            or I've not set the database up properly 

            Created a new database with UTF8 encoding

            
            I built the table schema based on the given
              format (but using type text for NM_MODIFY_DATE not
            varchar,  having these as dates didn't make a difference) 

          
            I used large text viewer to open the uncompressed
            countries.txt file and copied the top 5 rows into a test
            file

            
            Using   PostgreSQL Copy this test file imported correctly so
            I know my schema is correct

            copy my_table(List of columns ) from '\\Path\\To\\TestFile.txt' WITH delimiter E'\t' csv HEADER;

          
          However when I tried to ingest the larger  countries.txt
            (2.9GB) file I get an error "missing data" for column xyz at
            line 12345 (Last column with  data in that row,
            NM_MODIFY_DATE) 

            
            Using large text viewer again I located this line and copied
            together with proceeding and following lines into my test
            file and tried the copy again but get the same error 

            
            I opened the test file in excel to  see if there is columns
             missing although not every row has data in each column they
            do  all match 

             
            the problem row has UFI -3373955 & UNI 6329083  

          
            I don't know if this is relevant but looking at the database
            properties , in pgAdmin,
            the 'collection' &  'Character type' are both set as
            "English_United Kingdom, 1252 " I  didn't set this and
            creating a  new DB the options are "C",  "English_United
            Kingdom, 1252 " or "POSIX" 

          
          Could someone suggest what I'm doing wrong?
          Thank you

        
    -- 

      Angular momentum makes the world go 'round.