Jason Newton <nevion@xxxxxxxxx> writes: > I spent some time trying to get things to work as is, raising what limits I > could to no avail. So I decided to upgrade to 9.3 and use large binary > objects rather than making another file store due to a large convenience > of keeping everything in database. I noticed that my 35GB of files has > become 46GB of files, and there are 18522822 enteries in pg_largeobject > where as I only have 257 files ranging from 30MB to 400MB. To reiterate > the data is compressed via several HDF filers, so postgres isn't going to > do any better. It looks like unless there's 30% overhead for using > pg_largeobject which is pretty expensive! If your input data is uniformly incompressible, that's not too surprising. pg_largeobject tuples hold BLCKSZ/4 bytes of data, plus some overhead, so the only way that 4 of them will fit on a page is if compression saves more than the overhead. You only need a couple percent compression savings to make that work, but precompressed data might resist being compressed even that much. So you end up with only 3 tuples per page which makes it right about 33.33% overhead. > 1. How can I disable TOAST compression or whatever is going on in > pg_largeobject? The problem is that you're not getting any compression, so "disabling" it won't help. > 2. How in the world is 2KB block size for _large_ binary object a > reasonable size? As far as I can tell, it is introducing a very large > overhead. According to the commentary in the source code (which you've evidently read to some extent), a smaller block size was chosen to reduce the overhead of partial updates of large-object data. You're probably not going to do that, but PG doesn't know it. In any case, you can't choose a LOBLKSIZE exceeding, or even equaling, the page size; so there's not room for a huge change here. According to the above analysis, if you want to pack more LO data per page, you'd actually be better off with a *smaller* LOBLKSIZE to limit the per-page wastage. > 3. Should I be changing LOBLKSIZE if this is the main factor of the 30% > overhead? Is this straight forward? Long term, is that going to bite me > back in the behind? Yeah, it could, mainly because accidentally running a postmaster with the wrong LOBLKSIZE compiled in would be disastrous. We should have a cross-check on that value, and I'll see to it that one gets into future releases, but it's not there today. What would likely be the best bet for you is to run a modified build with LOBLKSIZE held at 2K and BLCKSZ bumped up to 32K. That would cut the wastage from 2K per 8K to 2K per 32K. Operationally, it'd be safer since there *is* a pg_control cross-check on BLCKSZ, so you could not accidentally start a standard build against the database. Or you could just live with it. 10GB of disk is cheap insurance against human error, and you're definitely taking some risk of human error if you run a locally-patched build. regards, tom lane