On 17 December 2014 at 13:55, Thomas Kellerer <spam_eater@xxxxxxx> wrote:
> Albe Laurenz wrote on 17.12.2014 at 11:07:
>> and the performance will be worse than reading files from the file system.
>
> There is Microsoft research [1] (from 2006) which tested this "myth" using SQL Server.
> It showed that the database might actually be faster than the file system.
>
> As this topic comes up at my workplace every now and then as well, I created a little web application (Java/JDBC) to test this on Postgres and possibly other DBMS.
>
> It turns out that Postgres isn't really slower at this than the file system either.
>
> For small files around 50k both perform similarly: the average time to read the blob from a bytea column was around 2ms, whereas the average time to read the blob from the filesystem was around 1ms. The test uses 50 threads to read the blobs using the PK of the table.
>
> "Reading from the filesystem" means looking up the path for the file in the database table and then reading the file from the filesystem.

With how many blobs/files did you test this?

I'm asking because PG stores all blobs in a single table. On a file-system, if all files are stored in a single directory, the situation is similar. However, a file-system has the ability to store files in several directories instead of just one, which is often claimed to improve file-locating performance.

Seeing as the read performance of a file (once it's been located) from the file-system versus a blob appears similar, the difference in time for locating the file might well be relevant here.

Interesting to see this was tested with MS SQL and therefore limited to NTFS. It's probably useful to test this with other file-systems, such as ZFS or UFS (with DIRHASH!), etc.

Regards,
Alban Hertroys
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
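
P.S. To make the "several directories" point concrete, here is a minimal sketch in Python of one common way to spread files over nested directories, by hashing the table's PK. It is not part of the test described above. The function name `sharded_path`, the root `/data/blobs`, the `.bin` suffix and the choice of SHA-1 are all assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, blob_id: int, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a primary-key value to a file path under nested hash-prefix directories.

    Hypothetical helper: with levels=2 and width=2 the files are spread over
    at most 256 * 256 directories instead of a single one.
    """
    # Hash the key so that sequential IDs do not all land in the same directory.
    digest = hashlib.sha1(str(blob_id).encode("ascii")).hexdigest()
    # Use consecutive slices of the hex digest as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, f"{blob_id}.bin")


if __name__ == "__main__":
    # Print the path that would be stored in the lookup table for PK 42.
    print(sharded_path("/data/blobs", 42))
```

Whether this actually speeds up locating a file depends on the file-system and how it indexes directories, which is exactly why the number of files in the test matters.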
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general