On Wed, Oct 19, 2022 at 5:05 PM Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
> On Wed, 2022-10-19 at 12:48 +0200, Dominique Devienne wrote:
> > On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh <andreas@xxxxxxxxxx> wrote:
> > > First advice, don't do it. We started off storing blobs in DB for “TX safety”
> >
> > Not really an option, I'm afraid.
>
> You should reconsider. Ruling out that option now might get you into trouble
> later. Large Objects mean trouble.

Andreas, Ericson, Laurenz, thanks for the advice. I'll be sure to discuss
these concerns with the team.

We already have other (bigger) data in the file system, albeit of a more
read-only nature. I'm not familiar with how security is handled in that
area, so I'll investigate whether there's a path forward to externalize
the largish blobs (currently destined to live in the DB). So I hope you
can see I'm not dismissing what you guys are saying.
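Concretely, what I'd be investigating is something like the sketch below.
It's minimal and hypothetical: the table, the column names, and the
blob-root convention are all made up, just to illustrate the pattern:

    -- Hypothetical sketch: keep only a reference plus integrity metadata
    -- in the DB, and store the actual bytes outside it.
    CREATE TABLE blob_ref (
        id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        rel_path   text   NOT NULL UNIQUE,  -- relative to a configured blob root
        byte_size  bigint NOT NULL,
        sha256     bytea  NOT NULL,         -- to detect DB/file-system drift
        created_at timestamptz NOT NULL DEFAULT now()
    );

As I understand it, the usual discipline is to write the file first and
insert the row last, so a crash can only leave an orphaned file (which
can be garbage-collected later), never a row pointing at a missing file.
That's the compromise you accept when giving up the “TX safety” Andreas
mentioned.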
But before I finish with this thread for now, I'd like to add that I
find it unfortunate that NOT putting the data in the DB is the generally
agreed-upon advice. IMHO it points to a weak point of PostgreSQL, which
does not invest in those use cases with large data, perhaps with more
file-system-like techniques. Probably because most large users of
PostgreSQL are more on the "business" side (numerous rows, but of
smaller sizes) than the "scientific" side, which (too often) uses files,
and file-in-a-file formats like HDF5.

FWIW, when Oracle introduced SecureFile blobs in v11 years ago, they
represented a leap forward in performance: back then we were seeing them
3x faster than LO at GB sizes, if I recall correctly, with throughput
that challenged regular networked file systems like NFS. That was over
10 years ago, so who knows where things stand now. And from the posts
here, the issues with large blobs may be related more to backup/restore
than to runtime performance.

Having all the data in the DB, under a single security model, is a big
win for consistency and simplicity. That it's not really practical now
is a pity, in my mind. My (probably uninformed) opinion on this is that
large blobs are handled just like other relational data, in paged
storage designed for smaller values. I.e. file-like blobs are shoehorned
into structures that are inappropriate for them, and a rethink and
redesign specifically for them is necessary, similar to Oracle's
SecureFile of old.
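You can see that paging from SQL: a large object is just rows in the
pg_largeobject catalog, chunked into LOBLKSIZE pages (BLCKSZ/4, i.e.
2048 bytes with default build options). A quick sketch, which I believe
needs superuser since pg_largeobject isn't publicly readable (the OID
shown is illustrative):

    -- Create a 10,000-byte large object from a bytea value.
    SELECT lo_from_bytea(0, convert_to(repeat('x', 10000), 'UTF8'));
    -- returns the new object's OID, say 16394

    -- Peek at how it is stored: one catalog row per 2 KB page.
    SELECT loid, pageno, octet_length(data)
    FROM pg_largeobject
    WHERE loid = 16394  -- substitute the OID returned above
    ORDER BY pageno;
    -- 5 rows: four 2048-byte pages plus a final 1808-byte one

bytea is similar: TOAST slices big values into roughly 2 KB chunks, with
a hard 1 GB cap per value. Fine for numerous-but-smallish data, not what
you'd design for multi-GB file-like blobs.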
I have similar gripes with SQLite, which is otherwise a fantastic
embedded DB. Just see how the SQLite-based Fossil SCM fails to scale for
very large repos with big (e.g. game) assets, and how a DB-backed store
similarly failed to scale in SVN a long time ago, to be replaced by a
forest-of-files (an approach Git also uses). DBs like PostgreSQL and
SQLite should be better at this, and I hope they get there eventually.

Sorry for turning a bit philosophical here. It's not a critique per se,
more the personal musings of a dev who's been in this space for a long
time. FWIW.

Thanks, --DD