On 6/25/06, Valerie Henson <val_henson@xxxxxxxxxxxxxxx> wrote:
Hi folks, I foolishly signed up to give a talk at OSCON in about a month about choosing and tuning Linux file systems for different workloads. I have some ideas about which file system to use when, but I'd rather get recommendations from the experts on each file system. Below is a straw man outline of my current recommendations, please take a look and comment. I will make a summary freely available when I'm done. At long last, I'll have an easy answer when someone asks me, "But which file system should I use?" Answer: "Go read this web page..."
heh, in other words, "bring on the flames, FUD, death threats, etc"
By the way, a lot of the data on file/fs limits and the like is from:

http://en.wikipedia.org/wiki/File_systems

If it's wrong, please go check the page and update it. Thanks!

Choosing a file system

Laptop: ext3 with noatime
General purpose server: ext3 or reiser
Lots of small files: reiser, ext2/3 with 1k blocks
More than ~32,000 files in one directory: XFS or reiser
Fast lookups in large directories: XFS, reiser, ext3 with htree (?)
File size more than 2TB: XFS, reiser up to 8TB
File system size more than 2TB: XFS, reiser up to 16TB
Ease of data recovery after corruption: ext2, ext3

Tuning a file system

Use "noatime" mount option
 - atime makes read workloads into random write workloads, yuck
 - This is Ubuntu installation default
 - I have a report that mutt doesn't work with this because atime is never updated but mtime is, maybe some kind of lazy atime is better?
 - Don't do if you want to e.g., track down hackers

Choosing journaling mode in ext3
 - Default is "ordered", usually the right choice
 - "journal" is slower but guarantees data is on-disk as well
 - "writeback" is faster but may result in garbage/security leaks in your file data
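The mutt report in the noatime notes comes down to a timestamp comparison. Here is a minimal Python sketch of it, assuming (my understanding, not something stated in this thread) that an mbox reader flags a mailbox as having new mail when its mtime is newer than its atime. The helper name `has_new_mail` and the timestamps are made up for illustration, and the timestamps are set by hand so the sketch does not depend on how the file system is mounted.

```python
import os
import tempfile


def has_new_mail(path):
    """Guess whether an mbox has unread mail by comparing timestamps.

    Assumption: the reader treats "modified after last read"
    (mtime > atime) as "new mail". Hypothetical helper, not mutt's code.
    """
    st = os.stat(path)
    return st.st_mtime > st.st_atime


def demo():
    """Simulate delivery and reading by setting the timestamps by hand."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Mail delivered at t=2000, mailbox last read at t=1000.
        os.utime(path, (1000, 2000))  # (atime, mtime)
        after_delivery = has_new_mail(path)

        # With atime updates on, reading bumps atime past mtime.
        os.utime(path, (3000, 2000))
        after_read = has_new_mail(path)

        # Under noatime the read leaves atime where it was,
        # so the mailbox still looks unread.
        os.utime(path, (1000, 2000))
        after_read_noatime = has_new_mail(path)
    finally:
        os.unlink(path)
    return after_delivery, after_read, after_read_noatime
```

Calling `demo()` returns the three answers in order: after delivery, after a normal read, and after a read on a noatime mount. The last case is the one that would confuse a reader relying on this check.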
XFS (and reiser4) use delayed block allocation, and have no "data=journal" option; however, reiser4 guarantees "data=ordered". delayed allocation can have a big performance advantage for interspersed writes and overwrites.
Choosing block size
 - You can do this at mkfs time
 - tradeoff is space wasted vs. max file/fs size (other considerations?)
 - limitation is system page size
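The "space wasted" half of the block size tradeoff can be estimated with a little arithmetic. This is a rough sketch only: it assumes every file occupies a whole number of blocks and ignores metadata, indirect blocks, and tail packing. The function names and the sample file sizes are made up for illustration.

```python
def wasted_bytes(file_sizes, block_size):
    """Total slack: bytes allocated in whole blocks but not used by data.

    Simplification: no metadata, indirect blocks, or tail packing.
    """
    total = 0
    for size in file_sizes:
        blocks = -(-size // block_size)  # ceiling division
        total += blocks * block_size - size
    return total


def compare_block_sizes(file_sizes, block_sizes=(1024, 2048, 4096)):
    """Return {block_size: wasted bytes} for each candidate block size."""
    return {bs: wasted_bytes(file_sizes, bs) for bs in block_sizes}
```

For example, `compare_block_sizes([100, 1024, 5000])` shows the slack growing as the block size goes up, which is the small-files argument for 1k blocks. Feeding it real file sizes from the target workload would give a more honest picture than any rule of thumb.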
you might also want to mention ext3 reservations, they can definitely increase performance for streaming workloads, and can be increased by changing a #define. too bad this sort of thing isn't generalized for all the FS's, with some sort of pre-allocation/mapping addition to the aops. it could even replace the bmap() call.
Tuning reiser - I know nothing!!! Help!
read up on the notail option, it is almost always the best idea. it reduces the number of seeks, at the cost of a small packing inefficiency. also, reiser4 fixes this problem (and some other big performance issues).

NATE
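To see which of the mount options mentioned in this thread (noatime, notail) are actually in effect, one option is to parse /proc/mounts. A minimal Python sketch follows, assuming the usual whitespace-separated layout of device, mount point, fs type, options. The `wanted` table is only an illustration of this thread's suggestions, and the function names are hypothetical.

```python
def parse_mounts(text):
    """Map mount point -> (fstype, set of options) from /proc/mounts-style text."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        result[mount_point] = (fstype, set(options.split(",")))
    return result


def missing_options(text, wanted):
    """Return {mount point: sorted list of wanted options that are not set}.

    wanted maps fs type -> set of option names; file systems whose type
    is not listed in wanted are ignored.
    """
    report = {}
    for mount_point, (fstype, options) in parse_mounts(text).items():
        missing = wanted.get(fstype, set()) - options
        if missing:
            report[mount_point] = sorted(missing)
    return report


# Illustrative only: options suggested in this thread, per fs type.
WANTED = {"ext3": {"noatime"}, "reiserfs": {"noatime", "notail"}}

SAMPLE = """\
/dev/sda1 / ext3 rw,noatime,data=ordered 0 0
/dev/sda2 /home reiserfs rw 0 0
proc /proc proc rw 0 0
"""
```

On a Linux box, `missing_options(open("/proc/mounts").read(), WANTED)` would run the same check against the live mount table. Whether a given option is listed there can depend on the kernel and on whether it was set explicitly, so treat an absent option as a hint, not proof.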