On Mon, Jun 13, 2011 at 3:27 AM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote: > I would not even consider tweaking the internal block sizes until > you've determined there is a problem you expect you might solve by > doing so. It's not a problem as such, but managing data chunks of 2000 bytes + the hundreds of rows per object in the large_object table for 10mb objects seems like a lot of wasted overhead, especially if the underlying filesystem manages 32kb or 64kb blocks. My impression of those values was that they are a bit antiquated or are tuned for storing small variable character objects, but not anything I'd call "binary large objects" these days. > The single most important factor affecting blob performance > in postgres is how you send and receive the data -- you absolutely > want to use the binary protocol mode (especially for postgres versions > that don't support hex mode). ÂThe next thing to look at is using > bytea/large object -- large objects are a bit faster and have a higher > theoretical limit on size but byea is a standard type and this offers > a lot of conveniences -- I'd say stick with bytea unless you've > determined there is a reason not to. ÂThat said, if you are not > writing C some client side drivers might only allow binary > transmission through the lo interface so that's something to think > about. Thanks, I got as much from the docs and the blogosphere. We are going to use the large object interface. That seems to be the least we can do - especially to avoid some encoding overhead. We are storing bytes after all and not ascii characters so there should be no encoding at all. We aren't using SQL as the query interface as such but the Python bindings (http://www.initd.org/psycopg/) so we can take full advantage of the underlying large object API's and do pretty direct lo_import / lo_export calls. We are targeting at least Postgres 9.0, potentially going for 9.1 soon after it hits final. Once we get further in the project, we'll of course do some intensive benchmarking for the various options with our specific data and configuration. I'm just trying to understand what bits and bytes are actually stored and transferred behind all those API's. Hanno -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general