On 9/09/2010 2:48 AM, Donald Catanzaro, PhD wrote:
So, latitude is a double precision column and I think that PostgreSQL is
interpreting the double quote as a NULL string
No, it's interpreting it as an empty string, not NULL. I suspect that's
what you meant, but "NULL string" is still NULL, you're getting an
empty(zero-length) *non-null* string.
See: http://www.postgresql.org/docs/8.4/static/sql-copy.html
I'm not sure COPY ... CSV knows how to treat "" as null. It has options
to force empty unquoted text input columns to be *non* null, but I don't
see the reverse, a way to force the empty string to be treated as null.
While the documentation notes that "many programs produce strange and
occasionally perverse CSV files", this particular format quirk doesn't
seem that strange and I'm surprised to see no obvious way to handle it.
Perhaps this is a case where you might need to pre-process the csv input
to clean it up a bit? I usually use Python's "csv" module for that. I
imagine ETL tools like Pentaho or Talend can do the job, but have never
used them.
Issue B) I have an associated issue with a text value where the NULL in
the data being represented by a double quote (e.g. "") is being inputed
as a quote. I can not use the switch NULL AS '"' because PostgreSQL says
"the quote character must not appear in the NULL specification"
Again, PostgreSQL knows how to handle the reverse, where the empty
unquoted string is null, and "" is the zero-length non-null string. It
doesn't seem to be able to deal with "" as a representation of NULL, though.
At least this one, unlike the double precision issue, can be handled by
dropping the NOT NULL constraint on the text column concerned, then
running an
UPDATE tablename SET col=NULL WHERE col=""
and restoring the constraint.
Given the file sizes are huge, I would rather not have to try to
preprocess the data. Is there anyway the COPY FROM command can handle
this data smoothly ?
It doesn't look like it to me.
--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general