On Mon, Aug 30, 2010 at 07:00:39PM +0200, Eric Luyten wrote:
> I just ran some tests on an empty test configuration and obtained
> surprising results with different data/metadata split situations.
>
> Server : SunFire X4170 with 72 GB of RAM, 2 x 300 GB internal disks
> in ZFS mirror configuration, 32 GB "LogZilla" SSD, 2 x 1 Gbps iSCSI
> SAN connection with multisession failover and load balancing
> capabilities.
>
> Test : start four parallel streams invoking /usr/cyrus/bin/deliver on
> a 50 KB message and send it 5000 (five thousand) times, each stream
> to a different mailbox.
>
> Metadata merged in data (on iSCSI SAN) : 114 seconds elapsed time, 175 msgs/s
> Metadata on SAS disks                  : 200 seconds elapsed time, 100 msgs/s
> Metadata on SSD                        :  97 seconds elapsed time, 206 msgs/s
>
> The ZFS 'atime' attribute was 'off' on all devices.
>
> The SSD is also used for the Cyrus DBs (mailboxes.db, deliver.db) and
> the 'proc' directory.

Symlink the 'proc' directory out into a tmpfs/ramfs. It doesn't need to
be on real storage.

> Dtrace shows that the Cyrus metadata files attract many synchronous
> writes.

Yes, yes they do. We fsync after each mailbox write - specifically both
the cyrus.index and the cyrus.cache. We also take an exclusive lock
(fcntl or flock) on the cyrus.index to keep it consistent during the
append.

Now the iSCSI data is interesting - but the question is IOPS: what sort
of random write IO can your SAN sustain? Particularly - does it have a
battery-backed write cache?

Remember that you only get one disk's worth of IO for random writes on
RAID1, because a write can't return until it's on solid storage on both
pieces of spinning rust.

Bron.
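
P.S. For the curious, the per-delivery pattern looks roughly like the
sketch below - take an exclusive lock, append, and fsync before the
lock is dropped. This is an illustration only, not the actual Cyrus
source; the function name, path handling and record layout are made up.

  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>

  /* Illustrative only - not Cyrus code.  Append one record to a
   * metadata file under an exclusive fcntl lock, and fsync before
   * releasing the lock so the record is durable when we unlock. */
  int append_record(const char *path, const void *rec, size_t len)
  {
      struct flock fl;
      int fd, r;
      ssize_t n;

      fd = open(path, O_WRONLY | O_APPEND);
      if (fd < 0) return -1;

      memset(&fl, 0, sizeof(fl));
      fl.l_type = F_WRLCK;            /* exclusive (write) lock...  */
      fl.l_whence = SEEK_SET;
      fl.l_start = 0;
      fl.l_len = 0;                   /* ...over the whole file     */
      if (fcntl(fd, F_SETLKW, &fl) < 0) { close(fd); return -1; }

      n = write(fd, rec, len);        /* the append itself          */
      r = (n == (ssize_t)len) ? fsync(fd) : -1;  /* force to disk   */

      fl.l_type = F_UNLCK;            /* only now release the lock  */
      fcntl(fd, F_SETLK, &fl);
      close(fd);
      return r;
  }

Every delivery pays at least one of those fsync round trips on each of
cyrus.index and cyrus.cache, which is why the latency of whatever
device holds the metadata dominates the msgs/s numbers above.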