On 4/4/07, Alexandre Vasconcelos <alex.vasconcelos@xxxxxxxxx> wrote:
We have an application that must sign documents and store them somewhere. File sizes may vary from KB to MB. Developers are arguing about whether to store the files directly on the operating system's file system or in the database, as large objects. My boss is considering file system storage, because he is concerned about integrity and about corruption during backup/restore. I'd like to know some reasons to convince them to store these files in PostgreSQL, including integrity and, of course, performance. I would like to know the disadvantages of file system storage as well.
This topic actually gets debated about once a month on the lists :-). Check the archives, but here is a quick summary:

Storing objects on the file system:
* usually indexed in the database for searching
* faster than the database (usually)
* more typical usage pattern
* requires extra engineering if you want to store huge numbers of objects
* requires extra engineering to keep your database in sync. on postgresql irc someone suggested a clever solution with inotify
* backup can be a pain (even rsync has its limits) -- for really big systems, look at clustering solutions (drbd for example)
* lots of people will tell you this 'feels' right or wrong -- ignore them :-)
* well traveled path. it can be made to work.

Storing objects in the database:
* slower, but getting faster -- it's mostly cpu bound currently
* get a very recent cpu. core2 xeons appear to be particularly good at this.
* use bytea, not large objects
* will punish you if your client interface does not communicate with the database in binary
* less engineering in the sense that you are not maintaining two separate systems
* forget backing up with pg_dump... go right to pitr (maybe slony?)
* 1GB limit. be aware of high memory requirements
* you get to work with all your data through a single interface and administer one system -- that's the big payoff.
* less well traveled path. put your r&d cap on and be optimistic but skeptical. do some tests.

merlin
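
A minimal sketch of the file system approach summarized above, to make the "extra engineering" points concrete: files go into hash-sharded directories so no single directory grows huge, and a database table indexes each file together with its checksum so that files and index can be checked against each other. Everything here is an assumption for illustration only: the function names, the table layout, and the use of Python's bundled sqlite3 as a stand-in for PostgreSQL so the example runs without a server.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_index(db_path=":memory:"):
    """Open the index database (sqlite3 here, standing in for PostgreSQL)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "name TEXT PRIMARY KEY, sha256 TEXT NOT NULL, path TEXT NOT NULL)"
    )
    return conn


def store_document(conn, root, name, data):
    """Write data to a hash-sharded path under root and index it by name."""
    digest = hashlib.sha256(data).hexdigest()
    # Shard on the leading hex characters: root/ab/cd/abcd...
    path = Path(root) / digest[:2] / digest[2:4] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        # Write to a temporary name, then rename, so readers never see
        # a half-written file.
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)
    # The file is written before the index row, so a crash in between
    # leaves an orphan file rather than a row pointing at nothing.
    with conn:
        conn.execute(
            "INSERT INTO documents (name, sha256, path) VALUES (?, ?, ?)",
            (name, digest, str(path)),
        )
    return digest


def verify_document(conn, name):
    """Return True if the indexed file exists and still matches its checksum."""
    row = conn.execute(
        "SELECT sha256, path FROM documents WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return False
    expected, path = row
    p = Path(path)
    return p.is_file() and hashlib.sha256(p.read_bytes()).hexdigest() == expected
```

Calling store_document(conn, "/some/root", "contract.pdf", data) and later verify_document(conn, "contract.pdf") is the kind of periodic sync check the file system route needs; storing the bytes in the database removes that second system, at the costs listed above.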