Hello,
I am building a PostgreSQL server into which I intend to load the
entirety of the PubMed database (23GB bzip2 compressed, 216GB
unpacked), which is available in XML form. Here is an example:
https://www.ncbi.nlm.nih.gov/pubmed/21833294?report=xml&format=text
I looked at the documentation as well as several code examples for
loading the data, and the one example that nearly succeeded is this
procedure:
/usr/bin/psql medline
\set :largexmlfile: 'cat /srv/pgsql/pubmed/medline17n0001.xml'
INSERT INTO samples (xmldata) VALUES :largexmlfile:
I'll assume you've just mis-keyed this from memory, since the syntax of the above doesn't look right.
(from reading the list post here:
https://www.postgresql.org/message-id/20160624083757.GA5459%40msg.df7cb.de )
When I run this, about 334MB of data from medline17n0001.xml floods the
monitor.
If the above general command sequence is done right, and echoing of commands is turned off, you should not see any of the XML file content echoed to the output.
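For reference, here is a minimal sketch of what that sequence usually looks like in psql. It is a sketch under assumptions, not something tested against your data: the variable name content is my own choice, and the commented-out table definition is a guess, since the real samples table was not shown. The pieces that matter are psql's backquote substitution in \set, which stores the command's output in the variable, and the :'name' form, which interpolates the variable as a properly quoted SQL literal.

```sql
-- Assumed table definition (hypothetical; the real one was not posted):
-- CREATE TABLE samples (xmldata xml);

-- Do not echo input lines or queries back to the terminal.
\set ECHO none

-- Backquotes make psql run the command and store its output in the variable.
\set content `cat /srv/pgsql/pubmed/medline17n0001.xml`

-- :'content' expands to the variable's value as a quoted SQL string literal.
INSERT INTO samples (xmldata) VALUES (:'content');
```

If the INSERT fails, the error report may still include part of the statement text, so it is worth trying this on a small file first.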
Is it possible to turn off validation of the content between the xml
tags of the files?
You can either turn off validation for the entire file or leave it on - PostgreSQL isn't recognizing tags here (you haven't defined the samples table for us...).
Narrowing down the entire file to a small problem region and posting a self-contained example, or at least providing the error messages and content, might help elicit good responses. Even if you could load the data without incident, using it may end up proving problematic. That said, character encodings and character sets are not my strong suit.
David J.