On Thu, Jan 08, 2009 at 05:20:00PM +0200, Janne Peltonen wrote:
> If I'm still following after reading through all this discussion,
> everyone who is actually using ReiserFS (v3) appears to be very content
> with it, even with very large installations. Apparently the fact that
> ReiserFS uses the BKL in places doesn't hurt performance too badly, even
> with multi core systems? Another thing I don't recall being mentioned
> was fragmentation - ext3 appears to have a problem with it, in typical
> Cyrus usage, but how does ReiserFS compare to it?

Yeah, I'm surprised the BKL hasn't hurt us more. Fragmentation, yeah it
does hurt performance a bit. We run a patch which causes a skiplist
checkpoint every time it runs a "recovery", which includes every restart.
We also tune skiplists to checkpoint more frequently in everyday use.
This helps reduce meta fragmentation.

For data fragmentation - we don't care. Honestly. Data IO is so rare.
The main time it matters is if someone does a body search.

Which leaves... index files. The worst case are files that are only ever
appended to, never any records deleted. Each time you expunge a mailbox
(even with delayed expunge) it causes a complete rewrite of the
cyrus.index file.

I also wrote a filthy little script (attached) which can repack cyrus
meta directories. I'm not 100% certain that it's problem free though, so
I only run it on replicas. Besides, it's not "protected" like most of our
auto-system functions, which check the database to see if the machine is
reporting high load problems and choke themselves until the load drops
back down again.

> I'm using this happily, with 50k users, 24 distinct mailspools of 240G
> each. Full backups take quite a while to complete (~2 days), but normal
> usage is quite fast. There is the barrier problem, of course... I'm
> using noatime (implying nodiratime) and data=ordered, since
> data=writeback resulted in corrupted skiplist files on crash, while
> data=ordered mostly didn't.

Yeah, full backups. Ouch. I think the last time we had to do that it
took somewhat over a week. Mainly CPU limited on the backup server, which
is doing a LOT of gzipping!

Our incremental backups take about 4 hours. We could probably speed this
up a little more, but given that it's now down from about 12 hours two
weeks ago, I'm happy. We were actually rate limited by Perl 'unpack' and
hash creation, believe it or not! I wound up rewriting Cyrus::IndexFile
to provide a raw interface, and unpacking just the fields that I needed.
I also asserted index file version == 10 in the backup library so I can
guarantee the offsets are correct.

I've described our backup system here before - it's _VERY_ custom, based
on a deep understanding of the Cyrus file structures. In this case it's
definitely worth it - it allows us to reconstruct partial mailbox
recoveries with flags intact. Unfortunately, "seen" information is much
trickier. I've been tempted for a while to patch cyrus's seen support to
store seen information for the user themselves in the cyrus.index file,
and only seen information for unowned folders in the user.seen files.
The way it works now seems optimised for the uncommon case at the expense
of the common. That always annoys me!

> Ext4 just got stable, so there is no real world Cyrus user experience on
> it. Among other things, it contains an online defragmenter. Journal
> checksumming might also help around the write barrier problem on LVM
> logical volumes, if I've understood correctly.

Yeah, it's interesting. Local fiddling suggests it's worse for my Maildir
performance than even btrfs, and btrfs feels more jerky than reiser3, so
I stick with reiser3.

> Reiser4 might have a future, at least Andrew Morton's -mm patch contains
> it and there are people developing it. But I don't know if it ever will
> be included in the "standard" kernel tree.

Yeah, the mailing list isn't massively active at the moment either... I
do keep an eye on it.

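To illustrate the "unpack just the fields that I needed" idea from the
backup discussion above, here is a minimal sketch. It is in Python rather
than Perl, and it is not the Cyrus::IndexFile code. The header layout,
record size, byte order and field offsets are all made up for the example
and are NOT the real cyrus.index format; only the check for version 10
comes from the description above.

```python
import struct

# Hypothetical layout for illustration only, not the real cyrus.index format.
HEADER = struct.Struct(">II")   # assumed header: (version, record_count)
RECORD_SIZE = 32                # assumed fixed record size in bytes
FIELD = struct.Struct(">I")     # assumed: each field is a 32-bit big-endian int
UID_OFFSET = 0                  # assumed offset of the UID within a record
FLAGS_OFFSET = 20               # assumed offset of the flags within a record
EXPECTED_VERSION = 10           # the only version the offsets are valid for


def read_uid_and_flags(buf):
    """Yield (uid, flags) per record, decoding only those two fields."""
    version, count = HEADER.unpack_from(buf, 0)
    if version != EXPECTED_VERSION:
        # Refuse to guess: the offsets are only meaningful for one version.
        raise ValueError(f"unsupported index version {version}")
    for i in range(count):
        start = HEADER.size + i * RECORD_SIZE
        (uid,) = FIELD.unpack_from(buf, start + UID_OFFSET)
        (flags,) = FIELD.unpack_from(buf, start + FLAGS_OFFSET)
        yield uid, flags


def build_sample(version, records):
    """Build a buffer in the assumed layout, so the sketch can be exercised."""
    out = bytearray(HEADER.pack(version, len(records)))
    for uid, flags in records:
        rec = bytearray(RECORD_SIZE)
        FIELD.pack_into(rec, UID_OFFSET, uid)
        FIELD.pack_into(rec, FLAGS_OFFSET, flags)
        out += rec
    return bytes(out)
```

Reading two fields at known offsets skips both the decoding of every
other field and the building of a per-record hash, which is where the
time was going according to the text above. The version check is what
makes hard-coded offsets safe to rely on.
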
> Btrfs is in so early development that I don't know yet what to say about
> it, but the fact of ZFS's being incompatible with GPL might be mitigated
> by this.

Yeah, btrfs looks interesting. Especially with their work on improving
locking - even on my little dual processor laptop (yay core processors)
I would expect to see an improvement when they merge the new locking
code.

> I'm going to continue using ext3 for now, and probably ext4 when it's
> available from certain commercial enterprise linux vendor (personally,
> I'd be using Debian, but the department has an official policy of using
> RH / Centos). I'm eagerly waiting for btrfs to appear... I probably /would/
> switch to ReiserFS for now, if RH cluster would support ReiserFS FS
> resources. Hmm, maybe I should just start hacking... On the other hand,
> the upgrade path from ext3 to ext4 is quite easy, and I don't know yet
> which would be better, ReiserFS or ext4.

Sounds sane. If vendor support matters, then ext4 is probably the
immediate future good choice. It's had a fair bit of work.

I'm tempted to keep an eye on tux3 too. Exciting times in the linux
filesystem world at the moment.

Bron.
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html