
Re: Filesystem vs. Postgres for images

On Apr 13, 2004, at 11:27 AM, scott.marlowe wrote:


On Tue, 13 Apr 2004, Christopher Petrilli wrote:

2. Retrieval time is limited not by disk bandwidth, but by I/O seek
performance. More spindles = more concurrent I/O in flight. Also, this
is where SCSI takes a massive lead with tag-command-queuing.

In our case, we ended up using a three-tier directory structure, so
that we could manage the number of files per directory, and then
because load was relatively even across the top 20 "directories", we
split them onto 5 spindle-pairs (i.e. RAID-1).  This is a place where
RAID-5 is your enemy. RAID-1, when implemented with read-balancing, is
a substantial performance increase.
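
A minimal sketch of the kind of three-tier layout described above. The hashing scheme, the fan-out of 256, and the names `tiered_path` and `mirror_pair_for` are assumptions for illustration only; the post does not say how files were actually assigned to directories or how directories were assigned to spindle-pairs.

```python
import hashlib
from pathlib import PurePosixPath


def tiered_path(root: str, name: str, top_dirs: int = 20, fanout: int = 256) -> PurePosixPath:
    """Map a file name to root/<top>/<mid>/<low>/<name>.

    Hypothetical scheme: a SHA-1 digest of the name picks each tier, so
    files spread roughly evenly and no directory grows without bound.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    top = int.from_bytes(digest[0:4], "big") % top_dirs  # one of the top-level "directories"
    mid = digest[4] % fanout
    low = digest[5] % fanout
    return PurePosixPath(root) / f"{top:02d}" / f"{mid:02x}" / f"{low:02x}" / name


def mirror_pair_for(top: int, pairs: int = 5) -> int:
    """Assumed placement: spread the top-level directories over the RAID-1 pairs."""
    return top % pairs


if __name__ == "__main__":
    p = tiered_path("/srv/images", "photo-123.jpg")
    print(p, "-> spindle-pair", mirror_pair_for(int(p.parts[3])))
```

With 20 top-level directories and 5 pairs, each pair ends up serving 4 of them, which only balances well if load is even across the top level, as it was in the case described.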

Please explain why RAID 5 is so bad here. I would think that on a not
very heavily updated fs, RAID-5 would be the functional equivalent of a
RAID 0 array with one fewer disk, wouldn't it? Or is RAID 0 also a bad
idea (other than the unreliability of it) because it only puts each piece
of data on one spindle, unlike RAID-1 which puts it on many.


In that case >2 drive RAID 1 setups might be a huge win. The linux kernel
certainly supports them, and I think some RAID cards do too.

The issue comes down to read and write strategies. If your files are bigger than the stripe size and a read begins to involve multiple drives, then the rotational latency of each drive comes into play. This is often hidden by caching in those wonderful comparison reviews, but when you're talking about nearly random access spread across more data than fits in the cache, you have to face the rotational behavior of the drives. Since the spindles are not synchronized, their rotational positions drift apart, and you often end up waiting for the worst-case latency among the drives involved. Mirroring doesn't face this, especially when you can distribute the READS across all the drives.
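
A rough way to see the rotational-latency point, under a deliberately simplified model that is an assumption here, not a measurement: each drive's rotational wait is independent and uniform between zero and one full rotation, and a striped read finishes only when the slowest drive involved has delivered its piece. Under that model the expected wait for a read touching k drives is k/(k+1) of a rotation, against 1/2 of a rotation for a read served by a single drive. The function names and the 6 ms rotation time (a 10,000 RPM drive) are illustrative.

```python
import random


def expected_rotational_wait(drives_touched: int, rotation_ms: float) -> float:
    """Mean of the maximum of k independent uniform waits on [0, rotation_ms)."""
    return rotation_ms * drives_touched / (drives_touched + 1)


def simulate_rotational_wait(drives_touched: int, rotation_ms: float,
                             trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The read completes when the slowest of the unsynchronized spindles comes around.
        total += max(rng.uniform(0.0, rotation_ms) for _ in range(drives_touched))
    return total / trials


if __name__ == "__main__":
    rotation_ms = 6.0  # one rotation at 10,000 RPM
    for k in (1, 2, 4):
        print(k, "drive(s):",
              round(expected_rotational_wait(k, rotation_ms), 2), "ms expected,",
              round(simulate_rotational_wait(k, rotation_ms), 2), "ms simulated")
```

The model ignores seek time, caching, and command queuing, so it only illustrates the direction of the effect: the more unsynchronized spindles a single read has to wait on, the closer the rotational wait gets to a full rotation.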


For example, if you ran triplex RAID-1, meaning 3 copies of the data (often done in large environments so that you can take one copy offline for a backup while keeping 2 copies online), then you can basically handle 3 reads for the cost of 1, which increases the number of read ops you can handle. This doesn't work with RAID-0 or RAID-5.
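
A toy sketch of read-balancing on a mirror holding 3 copies of the data, to make the "3 reads for the cost of 1" point concrete. The `MirrorSet` class and its round-robin policy are assumptions for illustration; real controllers and software RAID layers choose the drive for each read by their own rules.

```python
from itertools import cycle


class MirrorSet:
    """Toy N-way mirror: every drive holds a full copy of the data."""

    def __init__(self, drive_names):
        self.drives = list(drive_names)
        self.reads = {d: 0 for d in self.drives}
        self.writes = {d: 0 for d in self.drives}
        self._next_reader = cycle(self.drives)

    def read(self, block: int) -> str:
        # Any copy can satisfy a read, so hand each one to the next drive in turn.
        drive = next(self._next_reader)
        self.reads[drive] += 1
        return drive

    def write(self, block: int) -> None:
        # A write must reach every copy, so mirroring adds no write capacity.
        for drive in self.drives:
            self.writes[drive] += 1


if __name__ == "__main__":
    mirror = MirrorSet(["disk0", "disk1", "disk2"])
    for block in range(9):
        mirror.read(block)
    mirror.write(0)
    print("reads per drive: ", mirror.reads)
    print("writes per drive:", mirror.writes)
```

Each drive sees only a third of the reads, so three independent reads can be in flight at once, while every write still lands on all three drives.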

Chris
--
| Christopher Petrilli
| petrilli (at) amber.org

