Rod wrote:
Hello,

I have a web application where users upload and share files. After a file is uploaded it is copied to S3, and all subsequent downloads are served from there. So over its lifetime each file is accessed only twice: when it is created and when it is copied to S3. The files are documents of varying size, from a few kilobytes to 200 megabytes, and the total count is in the thousands to hundreds of thousands.

My dilemma: should I store the files in the PostgreSQL database, or store them in the filesystem and keep only metadata in the database?

I see these possible cons of using PostgreSQL as storage:

- more network bandwidth required compared to accessing an NFS-mounted filesystem?
- if the database becomes corrupt you can't recover individual files
- you can't back up a live database unless you install complicated replication add-ons
- more CPU required to store/retrieve files compared to filesystem access
- size overhead, e.g. storing 1000 bytes takes 1000 bytes in the database plus about 100 bytes for db metadata, indexes, etc.; with a lot of files this adds up

Are these concerns valid? Has anyone had this kind of design problem, and how did you solve it?
S3 storage is not suitable for running an RDBMS. An RDBMS wants fast, low-latency storage with 8k block random reads and writes; S3 is high latency and oriented towards streaming.
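For what it's worth, here is a minimal sketch of the "metadata in Postgres, bytes in S3" approach, assuming psycopg2 and boto3; the table, column, and bucket names are just placeholders, not anything from your setup:

    import os
    import boto3
    import psycopg2

    def store_file(path, bucket="my-uploads"):
        size = os.path.getsize(path)
        key = os.path.basename(path)

        # upload the file body to S3; this is the only copy of the bytes
        boto3.client("s3").upload_file(path, bucket, key)

        # keep only the metadata row in PostgreSQL
        conn = psycopg2.connect(dbname="appdb")
        with conn, conn.cursor() as cur:
            cur.execute(
                "INSERT INTO file_metadata (s3_bucket, s3_key, size_bytes) "
                "VALUES (%s, %s, %s) RETURNING id",
                (bucket, key, size),
            )
            file_id = cur.fetchone()[0]
        conn.close()
        return file_id

That keeps pg_dump backups small and lets you recover or re-upload individual files without touching the database.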