On 6/12/11 12:00:19 PM, Hanno Schlichting wrote:
Hi. I'm a new Postgres user. If I happen to ask stupid questions, please feel free to point me to any documentation I should read or guidelines for asking questions.

I'm looking into storing binary data in Postgres and trying to understand how the data is actually stored in the database. The dataset I'm looking at consists of images, photos and PDF documents, which are commonly at least 100 KB, around 10 MB on average, and can reach 100 MB per document. I want to store this data in the database, as I need transactional integrity and want to avoid the extra complexity of managing shared filesystems between a number of frontend application servers and database backends.

The binary data will only ever be accessed as a whole: either a complete new file is written to the DB, or a complete file is read and cached on a frontend server. I don't need streaming access or the ability to read partial data.

This use case seems to be well supported by Oracle 11g with the introduction of "secure files" (PDF: [1]). From what I've read of Postgres, my best bet is to store the data as large objects [2]. Going down that route means the binary data is stored as 2 KB chunks, with table row overhead added for each of those chunks. Using the bytea type and the TOAST backend [3], it seems to come down to the same thing: data is actually stored in 2 KB chunks for a page size of 8 KB.

I'm assuming I'll be able to get ~8 KB jumbo-frame packets over a gigabit network connection, and that I can use an ext4 volume with a block size of either 32 KB or 64 KB for the volume housing the binary data, but a smaller block size for the one housing the relational data. Given those other constraints, it seems silly to split the data up into 2 KB chunks at the database level. Is there any way the chunk size for binary data can be increased here, independently of the one for "normal relational" data?
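For reference, a minimal sketch of that "whole file in, whole file out" access pattern with both approaches, using psycopg2 and assuming a placeholder documents(id serial primary key, data bytea) table and DSN:

    import psycopg2

    conn = psycopg2.connect("dbname=test")     # placeholder DSN

    payload = open("photo.jpg", "rb").read()   # whole file, read up front

    # bytea/TOAST path: the document travels as a single value
    with conn, conn.cursor() as cur:
        cur.execute("INSERT INTO documents (data) VALUES (%s) RETURNING id",
                    (psycopg2.Binary(payload),))
        doc_id = cur.fetchone()[0]

    with conn, conn.cursor() as cur:
        cur.execute("SELECT data FROM documents WHERE id = %s", (doc_id,))
        blob = bytes(cur.fetchone()[0])        # complete file back in one read

    # large-object path: server-side object addressed by OID
    with conn:                                 # large objects require a transaction
        lob = conn.lobject(0, "wb")            # create a new large object
        lob.write(payload)
        oid = lob.oid
        lob.close()

    with conn:
        blob = conn.lobject(oid, "rb").read()  # read it back as a whole

    conn.close()

Either way, the client sees one opaque blob per document; the 2 KB chunking happens inside the server.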
You could redefine LOBLKSIZE and/or BLCKSZ so that larger chunks are stored for large objects, and then build PG from source. I assume that LOBLKSIZE is defined in terms of BLCKSZ (as BLCKSZ/4) for a reason, and that adjusting BLCKSZ is the better way to go.

Before doing this, I would set yourself up a performance test case so that you can be sure your changes are actually leading to an improvement. Since an RDBMS isn't typically used to simulate a large-scale filestore, it's likely that the 8K page size and 2K LO block size are inefficient if that's what it's being used for, but I wouldn't assume that larger sizes automatically mean more performance until you actually test it.

--
Bill Moran
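A rough sketch of such a performance test case, not a definitive benchmark: it writes N synthetic documents through each path and reports wall-clock time, again assuming psycopg2 and the placeholder documents table and DSN from above. Run it once against a stock build and once against a build with the changed block sizes before deciding.

    import os, time
    import psycopg2

    N, SIZE = 20, 10 * 1024 * 1024            # 20 synthetic documents of ~10 MB
    payload = os.urandom(SIZE)

    conn = psycopg2.connect("dbname=test")    # placeholder DSN

    def time_bytea():
        start = time.perf_counter()
        for _ in range(N):
            with conn, conn.cursor() as cur:
                cur.execute("INSERT INTO documents (data) VALUES (%s)",
                            (psycopg2.Binary(payload),))
        return time.perf_counter() - start

    def time_lobject():
        start = time.perf_counter()
        for _ in range(N):
            with conn:                        # large objects need a transaction
                lob = conn.lobject(0, "wb")   # create a new large object
                lob.write(payload)
                lob.close()
        return time.perf_counter() - start

    print("bytea   write: %.2fs" % time_bytea())
    print("lobject write: %.2fs" % time_lobject())
    conn.close()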