Command line used to invoke pg_dump:
pg_dump -v -F c -x -O -f "path/to/file" -h db_host -p db_port -U user_name database_name
A temporary file is created in a secure location and used via PGPASSFILE to avoid the password prompt. Here is the error from the pg_dump log:
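For reference, the temporary-password-file setup looks roughly like this (a minimal sketch; the file name, host, port, and credentials below are placeholders, not the real ones):

```shell
# Sketch of the PGPASSFILE setup described above -- all values are placeholders.
PGPASSFILE=$(mktemp)             # temporary file in a secure location
chmod 600 "$PGPASSFILE"          # libpq ignores a pgpass file with looser permissions
# pgpass format: hostname:port:database:username:password
echo 'db_host:db_port:database_name:user_name:placeholder_password' > "$PGPASSFILE"
export PGPASSFILE
# pg_dump then runs without prompting, e.g.:
# pg_dump -v -F c -x -O -f "path/to/file" -h db_host -p db_port -U user_name database_name
rm -f "$PGPASSFILE"              # removed once the dump finishes
```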
pg_dump: dumping contents of table XYZ (edited name)
pg_dump: [custom archiver] could not write to output file: File too large
pg_dump: *** aborted because of error
The left-over, partially-written file is a suspicious 2147483647 bytes in size, i.e. exactly 2^31 - 1. The database is running on 32-bit RHEL 5.3 (Linux 2.6.18-128.1.16). I don't know the libc version at the moment, but it would be whatever shipped with the distro. Here is the output of ulimit -a on that system:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 77823
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 77823
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
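Since the file-size ulimit is unlimited, the exact 2147483647-byte size points elsewhere: that is the largest value a signed 32-bit off_t can hold, the classic ceiling for a binary built without large-file support. A quick sanity check, plus a couple of diagnostic commands one could run on the hosts involved (paths are hypothetical):

```shell
# The truncated dump stops at exactly 2^31 - 1 bytes, the maximum of a
# signed 32-bit off_t -- suggesting pg_dump was built without large-file
# support (_FILE_OFFSET_BITS=64).
leftover_size=2147483647
[ "$leftover_size" -eq $(( (1 << 31) - 1 )) ] && echo "hit the 32-bit off_t ceiling"

# On the database host, one could confirm the target filesystem itself
# allows files larger than 2 GB (path is hypothetical):
#   getconf FILESIZEBITS /path/to/dump/dir    # 64 means the fs is fine
#
# And check whether the binary references the 64-bit file APIs at all:
#   nm -D /path/to/pg_dump | grep -E 'open64|fseeko64|ftello64'
```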
The pg_dump binary was compiled on a different system, 32-bit RHEL 4 (Linux 2.6.9-78.0.1), against glibc 2.3.4.
I don't have exact numbers at the moment, but the database is approximately 150 GB in size. Roughly two-thirds of that is occupied by one table with approximately 2 billion rows, which was dumped successfully according to the pg_dump log. Most of the remainder is occupied by table XYZ, with approximately 200 million rows.
Since the software is cross-platform, I'd prefer to avoid work-arounds such as "pipe it through split" unless tracking down the problem itself fails. Also, since the machine running the database belongs to a user, I'm limited in what I can try out on it. If changes are needed for a solution, that's fine, but changes purely for diagnosis, i.e. "try X and see if that works", are very difficult. I'm aware that the kernel and libc we build against are rather old, but they shouldn't be so old as to be a problem. Although I have more control over that machine, updating it unnecessarily is undesirable, and I can only do so if it would contribute to a solution. Finally, migrating to PostgreSQL 8.4 is not an option in the short term, unless it would definitely fix the problem.
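For completeness, the "pipe it through split" workaround I'd rather avoid would look something like this (chunk size and file names are arbitrary; connection options elided). The runnable part below just demonstrates on dummy data that split/cat round-trips the bytes intact:

```shell
# Workaround sketch: write the custom-format dump to stdout and split it,
# so no single file crosses 2 GB:
#   pg_dump -v -F c -x -O -h db_host -p db_port -U user_name database_name \
#     | split -b 1024m - dump.part.
# Restore by reassembling the pieces:
#   cat dump.part.* | pg_restore -d database_name

# Demo with dummy data standing in for the dump stream:
head -c 100000 /dev/zero > fake_dump.bin
split -b 40000 fake_dump.bin fake_dump.part.
cat fake_dump.part.* > reassembled.bin
cmp -s fake_dump.bin reassembled.bin && echo "round-trip OK"
```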
I know these restrictions may make it much harder to diagnose the issue, so I apologize in advance. Thank you for any help you can provide,
David