Search Postgresql Archives

Re: Copy From csv file with double quotes as null

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/09/2010 2:48 AM, Donald Catanzaro, PhD wrote:
So, latitude is a double precision column and I think that PostgreSQL is
interpreting the double quote as a NULL string

No, it's interpreting it as an empty string, not NULL. I suspect that's what you meant, but "NULL string" is still NULL, you're getting an empty(zero-length) *non-null* string.

See: http://www.postgresql.org/docs/8.4/static/sql-copy.html

I'm not sure COPY ... CSV knows how to treat "" as null. It has options to force empty unquoted text input columns to be *non* null, but I don't see the reverse, a way to force the empty string to be treated as null.

While the documentation notes that "many programs produce strange and occasionally perverse CSV files", this particular format quirk doesn't seem that strange and I'm surprised to see no obvious way to handle it.

Perhaps this is a case where you might need to pre-process the csv input to clean it up a bit? I usually use Python's "csv" module for that. I imagine ETL tools like Pentaho or Talend can do the job, but have never used them.

Issue B) I have an associated issue with a text value where the NULL in
the data being represented by a double quote (e.g. "") is being inputed
as a quote. I can not use the switch NULL AS '"' because PostgreSQL says
"the quote character must not appear in the NULL specification"

Again, PostgreSQL knows how to handle the reverse, where the empty unquoted string is null, and "" is the zero-length non-null string. It doesn't seem to be able to deal with "" as a representation of NULL, though.

At least this one, unlike the double precision issue, can be handled by dropping the NOT NULL constraint on the text column concerned, then running an
  UPDATE tablename SET col=NULL WHERE col=""
and restoring the constraint.

Given the file sizes are huge, I would rather not have to try to
preprocess the data. Is there anyway the COPY FROM command can handle
this data smoothly ?

It doesn't look like it to me.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux