How To: LARGE html text or csv file COPY FROM?

Hello Friends,

We're trying something for the first time: a COPY into a database from a TEXT (or CSV) file containing one really, really big field of html.

The field happens to be content of complete webpages, which we then need to later analyze, slice, dice, etc. - so it's verbatim html, with all the carriage returns, spaces, linefeeds(?) and double quotes included!

Problem is: with the very first record, the COPY command hiccups with a 'missing data for column' error.
In CSV mode, it's 'extra data after last expected column' (yes, using different input files for the test).

Both errors above make sense to me; COPY is running into either a cr or a tab character in each case.
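
For illustration, here is a minimal sketch of what the text-format input would have to look like for COPY to treat a whole page as one field. In COPY's text format a literal tab separates columns and a literal newline ends the row, so those characters (and the backslash itself) must be written as backslash escapes. The function name escape_copy_text and the sample string are made up for this example; it is a sketch under those assumptions, not a tested import pipeline.

```python
def escape_copy_text(value: str) -> str:
    """Escape one field for PostgreSQL COPY text format (hypothetical helper)."""
    # Backslash goes first, so the backslashes added by the later
    # replacements are not escaped a second time.
    replacements = [
        ("\\", "\\\\"),  # backslash       -> \\
        ("\t", "\\t"),   # tab             -> \t (otherwise a column separator)
        ("\n", "\\n"),   # linefeed        -> \n (otherwise the end of the row)
        ("\r", "\\r"),   # carriage return -> \r
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


# Made-up sample page fragment with a tab, a CR/LF pair, and a backslash.
html = '<p class="x">a\tb</p>\r\n<p>c\\d</p>'
line = escape_copy_text(html)
```

Each escaped page would then be written as a single physical line in the input file. Double quotes need no treatment in text format.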

Q: Is there a way to handle this directly, as a PG import?

Meanwhile, we're falling back on grep/gawk to remove all carriage returns in the field.
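
As a possible alternative to stripping the line breaks, the field could be kept verbatim by quoting it: in COPY's CSV mode a quoted field may contain carriage returns, linefeeds, and delimiters, and an embedded double quote is written as two double quotes. The sketch below uses Python's standard csv module to produce such a file; the function name html_pages_to_csv and the sample data are assumptions for illustration only.

```python
import csv
import io


def html_pages_to_csv(pages):
    """Return CSV text with one fully quoted field per row (hypothetical helper)."""
    # newline="" stops Python from translating line endings inside the field.
    buf = io.StringIO(newline="")
    # QUOTE_ALL wraps every field in double quotes; embedded double quotes
    # are doubled by the writer's default dialect settings.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for page in pages:
        writer.writerow([page])
    return buf.getvalue()


# Made-up sample: one page with double quotes and a CR/LF pair inside it.
sample = ['<a href="x">one</a>\r\n<p>two</p>']
csv_text = html_pages_to_csv(sample)
```

A file written this way could then be loaded with something along the lines of COPY pages (html) FROM '/path/to/pages.csv' WITH CSV, where the table, column, and path are placeholders.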

TIA for any help, inspiration, recipes (or time in the stocks).     Lou
