Vincent Fox wrote:
I originally brought up the ZFS question.
We seem to have arrived at a similar solution after much
experimentation: using ZFS for the things it already does well, and
leveraging proven hardware to fill in the weak spots.
I have a pair of Sun 3510FC arrays, from each of which we have exported
2 RAID5 LUNs (5 disks each), with one LUN on the primary and the other
on the secondary controller. This exploits the active-active controller
feature of the 3510FC. We are also doing multipathing through a pair of
SAN fabric switches.
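
(For anyone doing the same on Solaris 10, the multipathing here is just
the stock MPxIO; roughly along these lines, actual paths aside:

    # enable MPxIO on the FC HBAs (prompts for a reboot), then check paths
    stmsboot -e
    mpathadm list lu

After that each LUN shows up as a single multipathed device.)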
On top of that we then use ZFS to join a LUN from each array into a
mirror pair, and then add the other pair the same way. I guess you could
call it RAID 5+1+0. This architecture also lets us add more storage to
the pool online, simply by adding more 3510FC array pairs.
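
To make the layout concrete, the pool is built along these lines (the
device names here are made up for illustration, not our actual LUN
paths):

    # mirror one LUN from each array, then add the second pair the same way
    zpool create mail mirror c4t0d0 c5t0d0
    zpool add mail mirror c4t1d0 c5t1d0

Each half of a mirror lives on a different 3510FC, so losing a whole
array only degrades the mirrors.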
Performance in benchmarking (Bonnie++ etc.) has turned out to be little
different from turning them into JBODs and doing everything with ZFS.
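
(For reference, the kind of Bonnie++ run I mean; the size is just an
example, and should be at least twice RAM so caching doesn't flatter the
numbers:

    # 16 GB working set, run as an unprivileged user
    bonnie++ -d /mail/tmp -s 16384 -u nobody

)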
Behavior is more predictable to me since I know that the 3510 firmware
knows how to rebuild a RAID5 set using the assigned spare drive in that
array. With ZFS I see no way to specify which disk is assigned as spare
to a particular set of disks, which could mean a spare is pulled from
another array.
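As far as I can tell, spares can only be added at the pool level,
something like this (again a made-up device name):

    # the spare is shared by every vdev in the pool, not tied to one array
    zpool add mail spare c4t2d0
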
It's pretty nifty to be able to walk into the machine room and flip off
the power to an entire array and things keep working without a blip.
It's not the most efficient use of disk space, but with performance &
safety this promising for an EMAIL SERVER it will definitely be welcome.
I dread the idea of silent data corruption or a long fsck on a 1+ TB
mail spool, both of which ZFS should save us from. I have atime=off and
compression=on. Our setup is slightly different from yours in that we
are clustering 2 T2000s with 8GB RAM each, and we are currently setting
up Solaris Cluster 3.2 software in a failover configuration so we can
patch without downtime.
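
(The atime/compression settings mentioned above are just the usual zfs
properties, e.g. on a pool named "mail":

    zfs set atime=off mail
    zfs set compression=on mail

)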
Thanks for the idea about daily snapshots for recovering recent data; I
like that idea a lot. I'll tinker around with it. I wonder if there'd be
much of a penalty to upping the snapshots to every 8 hours; it depends
on how much churn there is in your mail spool, I suppose.
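
A rough sketch of what the 8-hour version might look like from root's
crontab, assuming a filesystem named mail/spool (a made-up name), with %
escaped as cron requires:

    # snapshot at 00:00, 08:00 and 16:00, named by date and hour
    0 0,8,16 * * * /usr/sbin/zfs snapshot mail/spool@`date +\%Y\%m\%d\%H`
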
This system should move into production later this month. We have
70,000 accounts that we'll begin slowly migrating over from our UW-IMAP
pool of servers. We have an existing Perdition proxy setup, which will
allow us to migrate users transparently. Hopefully I'll have more good
things to say about it sometime thereafter.
Are you going to do this with just "1" perdition server? Make sure you
have compiled perdition with /dev/urandom, or some other sort of
non-blocking entropy-providing device :)
Rudy
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html