Thanks.
> You mean 250M rows in pg_largeobject itself, or 250M large objects
(that is, 250M rows in pg_largeobject_metadata)?
(that is, 250M rows in pg_largeobject_metadata)?
250M large objects.
> Are you sure you're using a 64-bit build of pg_dump?
file /usr/lib/postgresql/10/bin/pg_dump
/usr/lib/postgresql/10/bin/pg_dump: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=4ff6bf0dd628d151481256723994dd50785e54e9, stripped
/usr/lib/postgresql/10/bin/pg_dump: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=4ff6bf0dd628d151481256723994dd50785e54e9, stripped
Also it really eats all the memory.
> Using the large-object API for things that tend to not actually be very large (which they aren't, if you've got hundreds of millions of 'em)
is an antipattern, I'm afraid.
I know :( So maybe I need to do some refactoring in the application and inline the lobs. The data is binary data with very high entropy (encrypted stuff). Would you recommend bytea for that?
Thanks again.
Mate
On Tue, Aug 14, 2018 at 7:58 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Mate Varga <m@xxxxxxxxxxxxx> writes:
> We have a PSQL 9.5 DB with 16G physical RAM and ~ 1 TB data mostly stored
> in the pg_largeobject system table. This table has 250M rows at the moment.
You mean 250M rows in pg_largeobject itself, or 250M large objects
(that is, 250M rows in pg_largeobject_metadata)?
> This command
> fails because of an OOM. Logs say:
> pg_dump: [archiver (db)] query failed: out of memory for query result
> pg_dump: [archiver (db)] query was: SELECT oid, (SELECT rolname FROM
> pg_catalog.pg_roles WHERE oid = lomowner) AS rolname, lomacl, NULL AS
> rlomacl, NULL AS initlomacl, NULL AS initrlomacl FROM
> pg_largeobject_metadata
The selected rows shouldn't be very wide here, so I'm a bit surprised
you are hitting OOM, even if there are 250M rows to read. Are you sure
you're using a 64-bit build of pg_dump?
> Would it help if we'd inline these largeobjects as e.g. text cols (instead
> of storing them as lobs)?
Yes, because then they'd not have their own ownership and permissions
for pg_dump to keep track of. (You might want bytea instead of text,
depending on what the data is.)
Using the large-object API for things that tend to not actually be very
large (which they aren't, if you've got hundreds of millions of 'em)
is an antipattern, I'm afraid. You could get away with it before we
added per-largeobject permissions, but now it's a problem for pg_dump.
regards, tom lane