Re: Storing Video's or vedio file in DB.

Thomas Kellerer <spam_eater@xxxxxxx> · Wed, 17 Dec 2014 13:55:46 +0100

Albe Laurenz schrieb am 17.12.2014 um 11:07:
> and the performance will be worse than reading files from the file system.

There is a Microsoft research [1] (from 2006) which tested this "myth" using SQL Server. 
It showed that the database might actually be faster than the file system. 

As this topic comes up at my workplace every now and then as well, I created a little web application (Java/JDBC) to test this on Postgres and possibly other DBMS. 

Turns out the Postgres as well isn't really slower at this than the file system.

For small files around 50k both perform similar: the average time to read the blob from a bytea column was around 2ms whereas the average time to read the blob from the filesystem was around 1ms. The test uses 50 threads to read the blobs using the PK of the table. 

"Reading from the filesystem" means looking up the path for the file in the database table and then reading the file from the filesystem. 

For larger files around 250k Postgres was actually faster in my tests: 130ms reading the bytea column vs. 260ms reading the file from disk.

The tests were done locally on my Windows laptop. 
I didn't have time yet to do this on a Linux server. I expect the filesystem to have some impact on the figures and NTFS is not known for being blazingly fast. So maybe those figures will change.

My tests however do not take into account the actual time it takes to send the binary data from the server to the client (=browser). It might well be possible that serving the file through an Apache Web Server directly is faster than serving the file through a JEE Servlet. My intention was to measure the raw read speed of the binary data from the medium where it is stored. 

> The downside is that you might end up with a huge database 
> that you will have to backup and maintain

I don't really buy the argument with the backup: the amount of data to be backed up is essentially the same.
With both solutions you can have incremental backups.

Another downside you didn't mentioned is the fact that you have to distribute the files in the filesystem properly.
Having thousands or even millions of files in a single directory is not going to be maintenance friendly either.

Regards
Thomas

[1] http://research.microsoft.com/apps/pubs/default.aspx?id=64525

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general