PG 10 vs. 11: Large increase in memory usage when selecting BYTEA data (actually out of memory with PG11)

Matthias Otterbach <mo@xxxxxxxxxxxx> · Wed, 13 Mar 2019 11:24:00 +0000

Dear mailing list,

I am currently testing an application, which previously ran on PostgreSQL 10, against the current PostgreSQL 11.2 release. During the tests the database raised out-of-memory errors that I could not explain; they seem to affect tables containing large BYTEA data.

I was able to narrow them down to a small example query where I also experienced a much higher memory usage with PostgreSQL 11 compared to previous versions.

For my comparison I worked on a local VM with only 4 GB of memory, a plain Ubuntu 18.04.2 installation (including all updates), and the current PostgreSQL 10.7 and 11.2 packages, respectively, installed from http://apt.postgresql.org/pub/repos/apt/dists/bionic-pgdg/. I created a table containing a large amount of BYTEA data using these statements:

> create table public.a (b bytea);
> insert into public.a select repeat('0', 1024 * 1024 * 100)::bytea from generate_series(1, 300);
> select pg_size_pretty(sum(length(b))) from public.a;
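
The size these statements produce can also be worked out by hand. The following is only a minimal sketch of that arithmetic; the class and method names are hypothetical, and the two constants are copied from the INSERT statement above.

```java
import java.util.Locale;

// Hypothetical helper, not part of the test program: it only redoes the
// arithmetic of the INSERT statement (300 rows of 1024 * 1024 * 100 bytes).
final class DataVolume {
    static final long ROWS = 300L;                          // generate_series(1, 300)
    static final long BYTES_PER_ROW = 1024L * 1024L * 100L; // repeat('0', 1024 * 1024 * 100)

    // Total number of BYTEA bytes stored in public.a.
    static long totalBytes() {
        return ROWS * BYTES_PER_ROW;
    }

    // Converts a byte count to binary gigabytes (1 GB = 1024^3 bytes here).
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long total = totalBytes();
        System.out.println(String.format(Locale.ROOT, "%d bytes = %.1f GB", total, toGiB(total)));
    }
}
```

Long arithmetic is used on purpose, because the total no longer fits into a 32-bit int.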

In total I now have ~29 GB of data (300 rows of 100 MB each), and I select all of it using a small Java program (current JDBC driver 42.2.5):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SelectBytea { // class name chosen arbitrarily for this listing
  public static void main(String[] args) throws SQLException {
    String sql = "SELECT b FROM public.a";
    String url = "jdbc:postgresql://127.0.0.1:5432/postgres?currentSchema=public";
    try (Connection connection = DriverManager.getConnection(url, "username", "password")) {
      connection.setAutoCommit(false); // the driver only fetches in batches with autocommit off
      try (PreparedStatement ps = connection.prepareStatement(sql)) {
        ps.setFetchSize(5); // we do not want to run out of Java heap space
        try (ResultSet rs = ps.executeQuery()) {
          int i = 0;
          while (rs.next()) {
            // Just loop over all rows and read the data. We print a line with the length
            // every 10 rows; the other conditions are never true with my test data.
            i++;
            byte[] b = rs.getBytes(1);
            if (i % 10 == 0 || b == null || b.length <= i) {
              System.err.println("Row " + i + ": " + (b != null ? b.length : null));
            }
          }
        }
      }
    }
  }
}

The Java program just executes SELECT b FROM public.a and keeps streaming rows, doing something with the content. The execution plan for my query is also fairly simple: it appears to be just a sequential scan (with both versions).

With PostgreSQL 10.7 the program ran through fine, with plenty of free memory on my database VM (including the OS, never more than 1 GB was in use on the VM). With PostgreSQL 11.2 the memory of my postgres process (the pid of my session) keeps increasing, and the query eventually fails after I have fetched only about 8 GB of the data:

TopMemoryContext: 67424 total in 5 blocks; 12656 free (10 chunks); 54768 used
TableSpace cache: 8192 total in 1 blocks; 2096 free (0 chunks); 6096 used
TopTransactionContext: 8192 total in 1 blocks; 7744 free (1 chunks); 448 used
pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 1456 free (0 chunks); 6736 used
RowDescriptionContext: 8192 total in 1 blocks; 6896 free (0 chunks); 1296 used
MessageContext: 8192 total in 1 blocks; 6752 free (1 chunks); 1440 used
Operator class cache: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
smgr relation table: 16384 total in 2 blocks; 4600 free (2 chunks); 11784 used
TransactionAbortContext: 32768 total in 1 blocks; 32512 free (0 chunks); 256 used
Portal hash: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
TopPortalContext: 8192 total in 1 blocks; 7392 free (0 chunks); 800 used
PortalContext: 1024 total in 1 blocks; 552 free (0 chunks); 472 used: C_1
ExecutorState: 4294976384 total in 17 blocks; 4080 free (0 chunks); 4294972304 used
printtup: 314581120 total in 3 blocks; 7936 free (8 chunks); 314573184 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
PortalContext: 1024 total in 1 blocks; 744 free (1 chunks); 280 used: 
Relcache by OID: 16384 total in 2 blocks; 3512 free (2 chunks); 12872 used
CacheMemoryContext: 524288 total in 7 blocks; 133056 free (5 chunks); 391232 used
[...]
Grand total: 4610628736 bytes in 192 blocks; 315312 free (137 chunks); 4610313424 used
2019-03-13 10:11:54.628 UTC [1350] postgres@postgres ERROR: out of memory
2019-03-13 10:11:54.628 UTC [1350] postgres@postgres DETAIL: Failed on request of size 268435456 in memory context "ExecutorState".
2019-03-13 10:11:54.628 UTC [1350] postgres@postgres STATEMENT: SELECT b FROM public.a
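
To make a dump like the one above easier to read, here is a minimal sketch that parses such lines and reports the memory context with the most bytes in use. The class and method names are hypothetical helpers of my own, and the regular expression assumes nothing beyond the line format visible in the dump. For the lines shown, it picks out ExecutorState, which holds roughly 4 GB.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading a memory context dump from the server log.
final class MemoryContextDump {
    // Matches e.g. "ExecutorState: 4294976384 total in 17 blocks; 4080 free (0 chunks); 4294972304 used"
    // and tolerates a trailing identifier such as ": C_1".
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(.+?): (\\d+) total in \\d+ blocks; \\d+ free \\(\\d+ chunks\\); (\\d+) used.*$");

    static final class Entry {
        final String name;
        final long totalBytes;
        final long usedBytes;

        Entry(String name, long totalBytes, long usedBytes) {
            this.name = name;
            this.totalBytes = totalBytes;
            this.usedBytes = usedBytes;
        }
    }

    // Returns the parsed entry, or null for lines that are not per-context
    // lines (for example the "Grand total" line or the ERROR lines).
    static Entry parseLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new Entry(m.group(1), Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }

    // Returns the context with the most used bytes, or null if no line parses.
    static Entry largest(List<String> lines) {
        Entry best = null;
        for (String line : lines) {
            Entry e = parseLine(line);
            if (e != null && (best == null || e.usedBytes > best.usedBytes)) {
                best = e;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "PortalContext: 1024 total in 1 blocks; 552 free (0 chunks); 472 used: C_1",
            "ExecutorState: 4294976384 total in 17 blocks; 4080 free (0 chunks); 4294972304 used",
            "printtup: 314581120 total in 3 blocks; 7936 free (8 chunks); 314573184 used");
        Entry e = largest(sample);
        System.out.println(e.name + ": " + e.usedBytes + " bytes used");
    }
}
```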

Am I even posting this to the right list? (Sorry if I chose the wrong one.) I also thought about filing a bug report, but it could be a bug in either PostgreSQL or the PG JDBC driver (I suspect the database itself, as I use the same driver against PG 10). Would we expect PG 11 to use that much more memory than PG 10? Is it maybe a known bug? (I had a quick look at pg-bugs but did not find anything concerning this problem.)

My process crashed after I had selected only about 80 rows; with PG 10 I was able to load all 300 rows (and even more data). Are there any configuration options I could play with? Almost all configuration options should be at their defaults (at least on my testing VM), e.g. work_mem = 4MB, shared_buffers = 128MB.

Best regards
Matthias