Vincent Fox wrote:
I originally brought up the ZFS question.
We seem to have arrived at a similar solution after much
experimentation: using ZFS for the things it already does well, and
leveraging proven hardware to fill in the weak spots.
I have a pair of Sun 3510FC arrays, from each of which we have exported
2 RAID5 LUNs (5 disks each), with one LUN on the primary and the other
on the secondary controller. This exploits the active-active controller
feature of the 3510FC. We are also doing multipathing through a pair of
SAN fabric switches.
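
(For anyone doing the same on Solaris 10, the multipathing here is just
the stock MPxIO; roughly along these lines, actual paths aside:

    # enable MPxIO on the FC HBAs (prompts for a reboot), then check paths
    stmsboot -e
    mpathadm list lu

After that each LUN shows up as a single multipathed device.)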
On top of that we then use ZFS to join a LUN from each array into a
mirror pair, and then add the other pair the same way. I guess you could
call it RAID 5+1+0. This architecture also lets us add more storage to
the pool online, simply by adding more 3510FC array pairs.
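
To make the layout concrete, the pool is built along these lines (the
device names here are made up for illustration, not our actual LUN
paths):

    # mirror one LUN from each array, then add the second pair the same way
    zpool create mail mirror c4t0d0 c5t0d0
    zpool add mail mirror c4t1d0 c5t1d0

Each half of a mirror lives on a different 3510FC, so losing a whole
array only degrades the mirrors.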
Performance in benchmarking (Bonnie++ etc.) has turned out to be little
different from turning them into JBODs and doing everything with ZFS.
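
(For reference, the kind of Bonnie++ run I mean; the size is just an
example, and should be at least twice RAM so caching doesn't flatter the
numbers:

    # 16 GB working set, run as an unprivileged user
    bonnie++ -d /mail/tmp -s 16384 -u nobody

)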
Behavior is more predictable to me since I know that the 3510 firmware
knows how to rebuild a RAID5 set using the assigned spare drive in that
array. With ZFS I see no way to specify which disk is assigned as spare
to a particular set of disks, which could mean a spare is pulled from
another array.
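As far as I can tell, spares can only be added at the pool level,
something like this (again a made-up device name):

    # the spare is shared by every vdev in the pool, not tied to one array
    zpool add mail spare c4t2d0
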
It's pretty nifty to be able to walk into the machine room and flip off
the power to an entire array and things keep working without a blip.
It's not the most efficient use of disk space, but with performance &
safety this promising for an EMAIL SERVER it will definitely be welcome.
I dread the idea of silent data corruption or a long fsck on a 1+ TB
mail spool, both of which ZFS should save us from. I have atime=off and
compression=on. Our setup is slightly different from yours in that we
are clustering 2 T2000s with 8GB RAM each, and we are currently setting
up Solaris Cluster 3.2 software in a failover configuration so we can
patch without downtime.
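
(The atime/compression settings mentioned above are just the usual zfs
properties, e.g. on a pool named "mail":

    zfs set atime=off mail
    zfs set compression=on mail

)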
Thanks for the idea about daily snapshots for recovering recent data; I
like that idea a lot. I'll tinker around with it. I wonder if there'd be
much of a penalty to upping the snapshots to every 8 hours; it depends
on how much churn there is in your mail spool, I suppose.
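
A rough sketch of what the 8-hour version might look like from root's
crontab, assuming a filesystem named mail/spool (a made-up name), with %
escaped as cron requires:

    # snapshot at 00:00, 08:00 and 16:00, named by date and hour
    0 0,8,16 * * * /usr/sbin/zfs snapshot mail/spool@`date +\%Y\%m\%d\%H`
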
This system should move into production later this month. We have
70,000 accounts that we'll begin slowly migrating over from our UW-IMAP
pool of servers. We have an existing Perdition proxy setup, which will
allow us to migrate users transparently. Hopefully I'll have more good
things to say about it sometime thereafter.
Are you going to do this with just "1" perdition server? Make sure you
have compiled perdition with /dev/urandom, or some other sort of
non-blocking entropy-providing device :)
Rudy
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html