On 6/12/11 12:00:19 PM, Hanno Schlichting wrote:
Hi. I'm a new Postgres user. If I happen to ask stupid questions, please feel free to point me to any documentation I should read or guidelines for asking questions.

I'm looking into storing binary data in Postgres and trying to understand how the data is actually stored in the database. The dataset I'm looking at consists of images, photos and PDF documents, which are commonly at least 100 KB, around 10 MB on average, and can reach 100 MB per document. I want to store this data in the database, as I need transactional integrity and want to avoid the extra complexity of managing shared filesystems between a number of frontend application servers and database backends.

The binary data will only ever be accessed as a whole: either a complete new file is written to the DB, or a complete file is read and cached on a frontend server. I don't need streaming access or the ability to read partial data.

This use case seems to be well supported by Oracle 11g with the introduction of "secure files" (PDF: [1]). From what I've read of Postgres, my best bet is to store the data as large objects [2]. Going down that route means the binary data is stored as 2 KB chunks, with table row overhead added for each of those chunks. Using the bytea type and the TOAST backend [3], it seems to come down to the same thing: data is actually stored in 2 KB chunks for a page size of 8 KB.

I'm assuming I'll be able to get ~8 KB jumbo-frame packets over a gigabit network connection, and that I can use an ext4 volume with a block size of either 32 KB or 64 KB for the volume housing the binary data, but a smaller block size for the one housing the relational data. Given those other constraints, it seems silly to split the data up into 2 KB chunks at the database level. Is there any way the chunk size for binary data can be increased here, independently of the one for "normal relational" data?
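For reference, a minimal sketch of that "whole file in, whole file out" access pattern with both approaches, using psycopg2 and assuming a placeholder documents(id serial primary key, data bytea) table and DSN:

    import psycopg2

    conn = psycopg2.connect("dbname=test")     # placeholder DSN

    payload = open("photo.jpg", "rb").read()   # whole file, read up front

    # bytea/TOAST path: the document travels as a single value
    with conn, conn.cursor() as cur:
        cur.execute("INSERT INTO documents (data) VALUES (%s) RETURNING id",
                    (psycopg2.Binary(payload),))
        doc_id = cur.fetchone()[0]

    with conn, conn.cursor() as cur:
        cur.execute("SELECT data FROM documents WHERE id = %s", (doc_id,))
        blob = bytes(cur.fetchone()[0])        # complete file back in one read

    # large-object path: server-side object addressed by OID
    with conn:                                 # large objects require a transaction
        lob = conn.lobject(0, "wb")            # create a new large object
        lob.write(payload)
        oid = lob.oid
        lob.close()

    with conn:
        blob = conn.lobject(oid, "rb").read()  # read it back as a whole

    conn.close()

Either way, the client sees one opaque blob per document; the 2 KB chunking happens inside the server.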
You could redefine LOBLKSIZE and/or BLCKSZ so that larger chunks are stored for large objects, and then build PG from source. I assume that LOBLKSIZE is defined in terms of BLCKSZ (as BLCKSZ/4) for a reason, and that adjusting BLCKSZ is the better way to go.

Before doing this, I would set yourself up a performance test case so that you can be sure your changes are actually leading to an improvement. Since an RDBMS isn't typically used to simulate a large-scale filestore, it's likely that the 8K page size and 2K LO block size are inefficient if that's what it's being used for, but I wouldn't assume that larger sizes automatically mean more performance until you actually test it.

--
Bill Moran
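A rough sketch of such a performance test case, not a definitive benchmark: it writes N synthetic documents through each path and reports wall-clock time, again assuming psycopg2 and the placeholder documents table and DSN from above. Run it once against a stock build and once against a build with the changed block sizes before deciding.

    import os, time
    import psycopg2

    N, SIZE = 20, 10 * 1024 * 1024            # 20 synthetic documents of ~10 MB
    payload = os.urandom(SIZE)

    conn = psycopg2.connect("dbname=test")    # placeholder DSN

    def time_bytea():
        start = time.perf_counter()
        for _ in range(N):
            with conn, conn.cursor() as cur:
                cur.execute("INSERT INTO documents (data) VALUES (%s)",
                            (psycopg2.Binary(payload),))
        return time.perf_counter() - start

    def time_lobject():
        start = time.perf_counter()
        for _ in range(N):
            with conn:                        # large objects need a transaction
                lob = conn.lobject(0, "wb")   # create a new large object
                lob.write(payload)
                lob.close()
        return time.perf_counter() - start

    print("bytea   write: %.2fs" % time_bytea())
    print("lobject write: %.2fs" % time_lobject())
    conn.close()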