Re: Choosing and tuning Linux file systems

On 6/25/06, Valerie Henson <val_henson@xxxxxxxxxxxxxxx> wrote:
Hi folks,

I foolishly signed up to give a talk at OSCON in about a month about
choosing and tuning Linux file systems for different workloads.  I
have some ideas about which file system to use when, but I'd rather
get recommendations from the experts on each file system.  Below is a
straw man outline of my current recommendations; please take a look
and comment.  I will make a summary freely available when I'm done.
At long last, I'll have an easy answer when someone asks me, "But
which file system should I use?"  Answer: "Go read this web page..."

heh, in other words, "bring on the flames, FUD, death threats, etc"

By the way, a lot of the data on file/fs limits and the like is from:

http://en.wikipedia.org/wiki/File_systems

If it's wrong, please go update the page.
Thanks!

Choosing a file system

Laptop: ext3 with noatime
General purpose server: ext3 or reiser
Lots of small files: reiser, ext2/3 with 1k blocks
More than ~32,000 files in one directory: XFS or reiser
Fast lookups in large directories: XFS, reiser, ext3 with htree (?)
File size more than 2TB: XFS, reiser up to 8TB
File system size more than 2TB: XFS, reiser up to 16TB
Ease of data recovery after corruption: ext2, ext3

Tuning a file system

Use "noatime" mount option
 - atime makes read workloads into random write workloads, yuck
 - This is the Ubuntu installation default
 - I have a report that mutt doesn't work with this because atime is
   never updated but mtime is; maybe some kind of lazy atime is better?
 - Don't do this if you want to, e.g., track down hackers
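
To check which file systems are mounted with "noatime", one option is
to read the options column of the mount table. The following is a
minimal Python sketch, assuming the /proc/mounts line layout (device,
mount point, type, options, dump, pass). The function names and the
sample devices are made up for illustration.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point, or None if absent."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed layout: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last matching entry, since a later mount on the
            # same mount point shadows an earlier one.
            found = set(fields[3].split(","))
    return found


def has_noatime(mounts_text, mount_point):
    """True if mount_point is listed and its options include noatime."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "noatime" in opts


# Hypothetical mount table used only for illustration.
SAMPLE = """\
/dev/sda1 / ext3 rw,noatime,data=ordered 0 0
/dev/sda2 /home ext3 rw,data=ordered 0 0
"""

if __name__ == "__main__":
    for mp in ("/", "/home"):
        print(mp, has_noatime(SAMPLE, mp))
```

On a live Linux system the same functions could be fed the contents of
/proc/mounts instead of SAMPLE.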

Choosing journaling mode in ext3
 - Default is "ordered", usually the right choice
 - "journal" is slower but guarantees data is on-disk as well
 - "writeback" is faster but may result in garbage/security leaks in
   your file data
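
The journaling mode is selected with the ext3 "data=" mount option. As
a rough sketch, the Python helper below builds an /etc/fstab line for a
chosen mode and rejects anything other than the three modes listed
above. The helper name, device, and mount point are hypothetical, and
the dump/pass values are only typical choices for a non-root file
system.

```python
EXT3_DATA_MODES = ("journal", "ordered", "writeback")


def ext3_fstab_line(device, mount_point, data_mode="ordered", noatime=True):
    """Build an fstab entry for an ext3 file system with the given data mode."""
    if data_mode not in EXT3_DATA_MODES:
        raise ValueError(f"unknown ext3 data mode: {data_mode!r}")
    options = ["defaults"]
    if noatime:
        options.append("noatime")
    options.append(f"data={data_mode}")
    # Assumption: dump=0 and fsck pass=2, common for non-root file systems.
    return f"{device} {mount_point} ext3 {','.join(options)} 0 2"


if __name__ == "__main__":
    # /dev/sdb1 and /data are placeholders, not taken from the thread.
    print(ext3_fstab_line("/dev/sdb1", "/data", "journal"))
```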

XFS (and reiser4) use delayed block allocation and have no
"data=journal" option; however, reiser4 guarantees "data=ordered"
semantics. Delayed allocation can have a big performance advantage for
interspersed writes and overwrites.

Choosing block size
 - You can do this at mkfs time
 - tradeoff is space wasted vs. max file/fs size (other considerations?)
 - limitation is system page size
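
The space-wasted side of the tradeoff can be estimated with simple
arithmetic: each file occupies a whole number of blocks, so the unused
part of its last block is lost. The Python sketch below compares 1k and
4k blocks for a made-up set of file sizes. It is a simplification that
ignores inodes, indirect blocks, directories, and any tail packing the
file system may do.

```python
def blocks_needed(size, block_size):
    """Number of whole blocks needed to hold size bytes (ceiling division)."""
    return (size + block_size - 1) // block_size


def wasted_bytes(file_sizes, block_size):
    """Total unused bytes in the last block of each file."""
    return sum(
        blocks_needed(size, block_size) * block_size - size
        for size in file_sizes
    )


# Hypothetical file sizes in bytes, chosen only for illustration.
SIZES = [100, 1500, 4096, 5000]

if __name__ == "__main__":
    for block_size in (1024, 4096):
        print(block_size, wasted_bytes(SIZES, block_size))
```

For these four files, 1k blocks waste 1592 bytes and 4k blocks waste
9784 bytes, which is why small blocks are attractive for workloads
dominated by small files.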

you might also want to mention ext3 reservations, they can definitely
increase performance for streaming workloads, and can be increased by
changing a #define.  too bad this sort of thing isn't generalized for
all the FS's, with some sort of pre-allocation/mapping addition to the
aops.  it could even replace the bmap() call.

Tuning reiser
 - I know nothing!!!  Help!

read up on the notail option, it is almost always the best idea.  it
reduces the number of seeks, at the cost of a small packing
inefficiency.  also, reiser4 fixes this problem (and some other big
performance issues).

NATE
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
