Search Postgresql Archives

Re: Binary Large Objects (LOB/BLOB) in Hibernate and JDBC: Unresolved issues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 8 Jan 2012 21:57:37 +0100, Stefan Keller wrote:
Thanks, Radosław, for the clarification.

2012/1/8 Radosław Smogura <rsmogura@xxxxxxxxxxxxxxx> wrote:
3. pg_largeobjects is system table, hibernate do not creates it.

4. Trigger based approach is good for this, but You need to do this mannualy

That's exactly my point:
Since JDBC manages creation of OID and pg_largeobjects it's also JDBC
which is responsible for update/delete of rows in pg_largeobjects.
As the user expects, the only thing JDBC would have to do is to issue
lo_unlink() when rows are updated or deleted.
Thus, it's currently a bug in the JDBC driver. And whatever the
solution is, it needs to be mentioned in the JDBC docs.
Not quite, PostgreSQL doesn't have LOB, nor OID type that is only reference to LOB. In fact, BLOB behaviour in JDBC is just thin wrapper for this what is missing in PostgreSQL - BLOBs. It was build form available parts. In addition OID type may be used and it's used as the system id or may be used as just some kind of row id - all types, tables, sequences etc has OID. You may create table with "WITH OIDS" clause. You may use OID data type just as replacement for (unsigned) int, so JDBC can create LOB but it can't decide if given field in row is reference to LOB or e.g. table and what with statements "DELETE WHERE date > "?

In fact JDBC driver is so "stupid" that if you will call getBytes or getBlob on any column with Oid it will ask for LOB.

Hibernate knows this and it does what is best - calls standard BLOB interface, and creates table with Oid column.

And here again round trip, in case of deletion only Hibernate may delete given LOB because only Hibernate and You knows that any value in Oid column will reflect LOB - JDBC driver doesn't "knows " this, but... oids may be shared because those are only numbers, not all tables may be covered by Hibernate, and assuming huge imagination, someone may encode OID by adding 1, may store it as long, etc. I know it's quite popular that DB schema comes from entities, but not always You have such flexibility and sometimes You create Entities for particular schema.

So, as You see only this trigger approach is currently (universally) the best way. If this is that module I think about it's just trigger which calls unlink on replace or deletion - in fact You may write own without any problems. Those are few lines only.

5. If you want to use bytea use
@Type(type="org.hibernate.type.PrimitiveByteArrayBlobType") (I think you
should remove @Lob too) on your field.

In fact, this annotation syntax usage looks like the Hibernate mapping
for PostgreSQL could be enhanced.
You have right, but Hibernate team will be in need to add auto deletion for bulk updates, too. PostgreSQL isn't still so popular it's worth of it (and they have no so much developers bug report may stand for months without any comment). Look how many peoples ask for Lobs. It means no one needs true LOB - true LOB large object stored outside table/row data space, to allow out of statements operation like streaming, partial updates etc. This is my definition of LOB, because this is idea of LOB. I think only DB2 stores LOBs in row, PostgreSQL do not make this but adverts this bytea ugly approach.

Personally, I create app which captures images from WebCam (like You) - just frames not movies. From above reason I wanted to move to bytea, but due to changes (wired instability and leak of backward compatibility) I still have Oid. Because I have only two tables for Lobs I have garbage collection simple script.

Best regards
Radek

Yours, Stefan

2012/1/8 Radosław Smogura <rsmogura@xxxxxxxxxxxxxxx>:
On Sun, 8 Jan 2012 18:08:09 +0100, Stefan Keller wrote:

I'd like to backup my statement below regarding in JDBC driver from
PostgreSQL:

When storing fields of type BLOB it inserts the binary string in
system table pg_largeobject (via user table). But when rows in user
table get updated or deleted it does not update nor delete
corresponding rows in table pg_largeobject.

That's really a bug!

Fortunately there's a solution indicated in the official docs
(http://www.postgresql.org/docs/9.1/static/lo.html)
To me, something like this should be implemented before hand in JDBC
driver.
And in any case there should be a bold note about this in the JDBC
docs (http://jdbc.postgresql.org/documentation/head/binary-data.html )

Yours, Stefan



2012/1/6 Stefan Keller <sfkeller@xxxxxxxxx>:

Hi,

I run into a nasty behavior of current PostgreSQL JDBC.

I maintain images (from Webcams). In the Java and Hibernate (JPA) code I specified a @Lob annotation on class MyData and a attribte/data type "byte[] mydata;". Hibernate then generates two tables in PostgreSQL, one called MyData with a column mydata of type oid and an internal one called pg_largobjects (which contain foreign keys to the oid). That's also explained in the JDBC docs [1], saying "PostgreSQL provides two
distinct ways to store binary data. Binary data can be stored in a
table using the data type bytea or by using the Large Object feature which stores the binary data in a separate table in a special format
and refers to that table by storing a value of type oid in your
table."

Now, when replacing the images (few hundred) with new ones using Java,
pg_largeobjects grew constantly until the file system run out of
space. So old image data did'nt get released! This is to me a bug
because the user/programmer must (and should) assume that there is a
strict 1:1 relationship between generated table MyData and its LOB
column data (stored in pg_largeobjects).
=> I finally found the supplied module 'lo' [2] which releases
detached records. Is this the recommended way to resolve this problem?

Searching for explanations I found a ticket HHH2244 [3] which was
closed by the Hibernate team without action referring to the JDBC
Spec. which says: "An SQL BLOB is a built-in type that stores a Binary
Large Object as a column value in a row of a database table".
=> In other words: The PostgreSQL JDBC team should take action on this
but didn't until now, right?

There is another issue about "PostgreSQL and BLOBs" [4]. First it
cites PostgreSQL JDBC docs [1]. The thread [4] ends somehow too in a 'deadlock' concluding  "...the PostgreSQLDialect (as of 3.5.5) needs to change not to use MaterializedBlobType until the Postgres (JDBC) team changes their driver (which does not seem to have happened in the
last 6 years)."
=> Any solutions or comments form insiders on this?

Yours, Stefan

[1] http://jdbc.postgresql.org/documentation/head/binary-data.html
[2] http://www.postgresql.org/docs/current/interactive/lo.html
[3] https://hibernate.onjira.com/browse/HHH-2244
[4] http://relation.to/Bloggers/PostgreSQLAndBLOBs

This is common approach for PostgreSQL and some proxy of real life Blobs.
One time I submitted bug about this to Hibernate. But step by step.

1. BLOBs are... Large Objects, they are stored as reference because those objects are large, if you will store this objects as bytea then select * will return all large data. It may not be comfortable not only to download few GB of data, but to keep this on stack too. From your perspective it doesn't matters because you put it in byte[]. But if You will keep e.g. CD-ROM images then it's much more better to use streaming approach then bytea[]. More over due to some security JDBC driver will at least double
memory consumed by bytea.

2. Specifying hibernate data type as bytea do not resolve problems because
it will still use LOB approach.

3. pg_largeobjects is system table, hibernate do not creates it.

4. Trigger based approach is good for this, but You need to do this mannualy

5. If you want to use bytea use
@Type(type="org.hibernate.type.PrimitiveByteArrayBlobType") (I think you
should remove @Lob too) on your field.

Regards,
Radosław Smogura


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux