On Mon, Aug 30, 2010 at 07:00:39PM +0200, Eric Luyten wrote:
> I just ran some tests on an empty test configuration and obtained
> surprising results with different data/metadata split situations.
>
> Server : SunFire X4170 with 72 GB of RAM, 2 x 300 GB internal disks
> in ZFS mirror configuration, 32 GB "LogZilla" SSD, 2 x 1 Gbps iSCSI
> SAN connection with multisession failover and load balancing
> capabilities.
>
> Test : start four parallel streams invoking /usr/cyrus/bin/deliver on
> a 50 KB message and send it 5000 (five thousand) times, each stream
> to a different mailbox.
>
> Metadata merged in data (on iSCSI SAN) : 114 seconds elapsed time, 175 msgs/s
> Metadata on SAS disks                  : 200 seconds elapsed time, 100 msgs/s
> Metadata on SSD                        :  97 seconds elapsed time, 206 msgs/s
>
> The ZFS 'atime' attribute was 'off' on all devices.
>
> The SSD is also used for the Cyrus DBs (mailboxes.db, deliver.db) and
> the 'proc' directory.

Symlink the 'proc' directory out into a tmpfs/ramfs. It doesn't need to
be on real storage.

> Dtrace shows that the Cyrus metadata files attract many synchronous
> writes.

Yes, yes they do. We fsync after each mailbox write - specifically both
the cyrus.index and the cyrus.cache. We also take an exclusive lock
(fcntl or flock) on the cyrus.index to keep it consistent during the
append.

Now the iSCSI data is interesting - but the question is IOPS: what sort
of random write IO can your SAN sustain? Particularly - does it have a
battery-backed write cache?

Remember that you only get one disk's worth of IO for random writes on
RAID1, because a write can't return until it's on solid storage on both
pieces of spinning rust.

Bron.
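
P.S. For the curious, the per-delivery pattern looks roughly like the
sketch below - take an exclusive lock, append, and fsync before the
lock is dropped. This is an illustration only, not the actual Cyrus
source; the function name, path handling and record layout are made up.

  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>

  /* Illustrative only - not Cyrus code.  Append one record to a
   * metadata file under an exclusive fcntl lock, and fsync before
   * releasing the lock so the record is durable when we unlock. */
  int append_record(const char *path, const void *rec, size_t len)
  {
      struct flock fl;
      int fd, r;
      ssize_t n;

      fd = open(path, O_WRONLY | O_APPEND);
      if (fd < 0) return -1;

      memset(&fl, 0, sizeof(fl));
      fl.l_type = F_WRLCK;            /* exclusive (write) lock...  */
      fl.l_whence = SEEK_SET;
      fl.l_start = 0;
      fl.l_len = 0;                   /* ...over the whole file     */
      if (fcntl(fd, F_SETLKW, &fl) < 0) { close(fd); return -1; }

      n = write(fd, rec, len);        /* the append itself          */
      r = (n == (ssize_t)len) ? fsync(fd) : -1;  /* force to disk   */

      fl.l_type = F_UNLCK;            /* only now release the lock  */
      fcntl(fd, F_SETLK, &fl);
      close(fd);
      return r;
  }

Every delivery pays at least one of those fsync round trips on each of
cyrus.index and cyrus.cache, which is why the latency of whatever
device holds the metadata dominates the msgs/s numbers above.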