On 7/7/10, Jeff Darcy <jdarcy at redhat.com> wrote:
> A bunch of ext4/xfs/etc. maintainers are in my group. The "party line"
> is that ext4 can be made suitable for production use *if* you have all
> of the latest patches (not just ext4 itself but block layer etc.) *and*
> set the right options. IIRC the default versions and options shipped
> with most distributions - including older versions of RHEL and Fedora -
> are probably not the ones you want. The downside is that if you want
> greater data safety you pay for it in performance, and some of the
> performance regressions associated with switching to safer defaults have

Data safety is paramount from my POV. Clients can usually accept network downtime and hardware failures, but corruption or loss of data is usually unacceptable. That said, they won't accept it if the new "expensive" setup is slower than the existing one or can't handle at least twice the load.

Normally I wouldn't care much about the performance impact; +/- 20% usually doesn't make a difference to the kinds of loads I see on client servers. However, since I'm already virtualizing, that's probably a 10-15% hit on its own, especially on the IO side (I'm hoping Intel VT-d will negate that), plus the added latency of storage being on the network. So I'm starting to worry that I'll end up with, say, 20% of local-disk performance.

> been widely discussed. If XFS is an option for you, it might be worth
> considering because it balances these safety and performance needs a
> little better. Otherwise, I'd recommend careful research and
> configuration of ext4, because these are the kinds of problems you
> probably won't catch in a synthetic testing environment and you really
> don't want to be debugging data-integrity problems just after the Big
> Power Hit.

I read up on XFS after your recommendation, and it seems ext4 borrowed its delayed-allocation feature from XFS, which has the same zero-length-file problem.
Yet more reading says they've sort of fixed that in XFS, and that ext3 actually does the same thing; it just has a much shorter 5-second commit interval, so you lose less data and don't get zeroed-out files after a crash. I can't seem to find how long it takes XFS to flush: some XFS documentation listing options says the default metadata flush is 3.5 seconds, but gives no clue about actual file data.

The sad thing is, the more I read up, the more worried I get, and I haven't even gotten around to asking about fencing, or about the performance difference between many small files (Exim in Maildir) and updates to a single big file (a MySQL/PostgreSQL database). I don't know if I'm worrying more than I should; I was sleeping easier before I knew ext3 delays writes too :D
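And for completeness, my current understanding of the "set the right options" side of this: a safety-leaning ext4 fstab line might look like the sketch below. The device, mount point, and exact option set are my own guesses, not anything Jeff or the ext4 maintainers have endorsed, so please correct me if I'm off.

```
# Hypothetical /etc/fstab entry favouring safety over speed on ext4.
# /dev/sdb1 and /data are placeholders.
#
# barrier=1      keep write barriers on so journal commits reach the platter
# data=ordered   write file data before committing metadata that refers to it
# commit=5       flush the journal every 5 seconds (ext3's old default)
/dev/sdb1  /data  ext4  defaults,barrier=1,data=ordered,commit=5  0 2
```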
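In case it helps anyone else following this thread: whichever filesystem flushes when, the usual application-side defence against the zero-length-file problem is the write-to-temp, fsync, rename dance. A minimal Python sketch (the function name and paths are mine, and it assumes POSIX rename semantics on a single filesystem):

```python
import os

def atomic_write(path, data):
    """Replace `path` with `data` so that, after a crash, a reader sees
    either the old contents or the new ones -- never a zeroed-out file."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # push file data to disk before the rename
    os.rename(tmp, path)       # atomic on POSIX: old file or new, never empty

atomic_write("example.conf", b"key = value\n")
```

A fully paranoid version would also fsync the containing directory so the rename itself survives a crash, but the pattern above is the part that delayed allocation punishes you for skipping.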