Command line used to invoke pg_dump:
pg_dump -v -F c -x -O -f "path/to/file" -h db_host -p db_port -U user_name database_name
A temporary file is created in a secure location and used via PGPASSFILE to avoid the password prompt. Here is the error from the pg_dump log:
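For reference, the temporary-password-file setup looks roughly like this (a minimal sketch; the file name, host, port, and credentials below are placeholders, not the real ones):

```shell
# Sketch of the PGPASSFILE setup described above -- all values are placeholders.
PGPASSFILE=$(mktemp)             # temporary file in a secure location
chmod 600 "$PGPASSFILE"          # libpq ignores a pgpass file with looser permissions
# pgpass format: hostname:port:database:username:password
echo 'db_host:db_port:database_name:user_name:placeholder_password' > "$PGPASSFILE"
export PGPASSFILE
# pg_dump then runs without prompting, e.g.:
# pg_dump -v -F c -x -O -f "path/to/file" -h db_host -p db_port -U user_name database_name
rm -f "$PGPASSFILE"              # removed once the dump finishes
```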
pg_dump: dumping contents of table XYZ (edited name)
pg_dump: [custom archiver] could not write to output file: File too large
pg_dump: *** aborted because of error
The left-over, partially-written file is a suspicious 2147483647 bytes in size, i.e. exactly 2^31 - 1. The database is running on 32-bit RHEL 5.3 (Linux 2.6.18-128.1.16). I don't know the libc version at the moment, but it would be whatever shipped with the distro. Here is the output of ulimit -a on that system:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 77823
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 77823
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
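Since the file-size ulimit is unlimited, the exact 2147483647-byte size points elsewhere: that is the largest value a signed 32-bit off_t can hold, the classic ceiling for a binary built without large-file support. A quick sanity check, plus a couple of diagnostic commands one could run on the hosts involved (paths are hypothetical):

```shell
# The truncated dump stops at exactly 2^31 - 1 bytes, the maximum of a
# signed 32-bit off_t -- suggesting pg_dump was built without large-file
# support (_FILE_OFFSET_BITS=64).
leftover_size=2147483647
[ "$leftover_size" -eq $(( (1 << 31) - 1 )) ] && echo "hit the 32-bit off_t ceiling"

# On the database host, one could confirm the target filesystem itself
# allows files larger than 2 GB (path is hypothetical):
#   getconf FILESIZEBITS /path/to/dump/dir    # 64 means the fs is fine
#
# And check whether the binary references the 64-bit file APIs at all:
#   nm -D /path/to/pg_dump | grep -E 'open64|fseeko64|ftello64'
```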
The pg_dump binary was compiled on a different system, 32-bit RHEL 4 (Linux 2.6.9-78.0.1), against glibc 2.3.4.
I don't have exact numbers at the moment, but the database is approximately 150 GB in size. Roughly two-thirds of that is occupied by one table with approximately 2 billion rows, which was dumped successfully according to the pg_dump log. Most of the remainder is occupied by table XYZ, with approximately 200 million rows.
Since the software is cross-platform, I'd prefer to avoid work-arounds such as "pipe it through split" unless tracking down the problem itself fails. Also, since the machine running the database belongs to a user, I'm limited in what I can try out on it. If changes are needed for a solution, that's fine, but changes purely for diagnosis, i.e. "try X and see if that works", are very difficult. I'm aware that the kernel and libc we build against are rather old, but they shouldn't be so old as to be a problem. Although I have more control over that machine, updating it unnecessarily is undesirable, and I can only do so if it would contribute to a solution. Finally, migrating to PostgreSQL 8.4 is not an option in the short term, unless it would definitely fix the problem.
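For completeness, the "pipe it through split" workaround I'd rather avoid would look something like this (chunk size and file names are arbitrary; connection options elided). The runnable part below just demonstrates on dummy data that split/cat round-trips the bytes intact:

```shell
# Workaround sketch: write the custom-format dump to stdout and split it,
# so no single file crosses 2 GB:
#   pg_dump -v -F c -x -O -h db_host -p db_port -U user_name database_name \
#     | split -b 1024m - dump.part.
# Restore by reassembling the pieces:
#   cat dump.part.* | pg_restore -d database_name

# Demo with dummy data standing in for the dump stream:
head -c 100000 /dev/zero > fake_dump.bin
split -b 40000 fake_dump.bin fake_dump.part.
cat fake_dump.part.* > reassembled.bin
cmp -s fake_dump.bin reassembled.bin && echo "round-trip OK"
```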
I know these restrictions may make it much harder to diagnose the issue, so I apologize in advance. Thank you for any help you can provide,
David