Re: Choosing and tuning Linux file systems

On Sun, Jun 25, 2006 at 03:00:53PM -0700, Valerie Henson wrote:
> I foolishly signed up to give a talk at OSCON in about a month about
> choosing and tuning Linux file systems for different workloads.  I
> have some ideas about which file system to use when, but I'd rather
> get recommendations from the experts on each file system.  Below is a
> straw man outline of my current recommendations, please take a look
> and comment.  I will make a summary freely available when I'm done.
> At long last, I'll have an easy answer when someone asks me, "But
> which file system should I use?"  Answer: "Go read this web page..."

Here are some comments.

> Choosing a file system
> 
> Laptop: ext3 with noatime
> General purpose server: ext3 or reiser
> Lots of small files: reiser, ext2/3 with 1k blocks

Lots of small files usually means lots of files per directory, so be
sure to use htree (the dir_index feature) with ext3.

> More than ~32,000 files in one directory: XFS or reiser

Ext3 can easily have more than 32000 *files* in a directory. However,
it can only have 32000 *subdirectories* in a directory. This limit is
from struct ext3_inode->i_links_count, which is an __le16: each
subdirectory has a ".." entry that links back to its parent, increasing
the parent's i_links_count.
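
To make the link counting concrete, here is a toy model in Python. It is
only a sketch, not ext3 code: the Dir class and max_subdirs() are made up
for illustration, and it assumes the kernel caps the count at
EXT3_LINK_MAX = 32000 and that an empty directory already holds two links
(its name in the parent plus its own "." entry).

```python
EXT3_LINK_MAX = 32000  # assumed value of the kernel constant


class Dir:
    """Toy model of a directory inode's link count (hypothetical, not ext3 code)."""

    def __init__(self):
        # An empty directory has two links: its name in the parent and ".".
        self.links = 2
        self.files = 0

    def mkdir(self):
        # The new child's ".." entry is one more link to this directory.
        if self.links >= EXT3_LINK_MAX:
            raise OSError("too many links")
        self.links += 1
        return Dir()

    def create_file(self):
        # A regular file adds a directory entry but no link to the directory.
        self.files += 1


def max_subdirs():
    """Count how many subdirectories fit before the link limit is hit."""
    parent = Dir()
    count = 0
    try:
        while True:
            parent.mkdir()
            count += 1
    except OSError:
        return count


if __name__ == "__main__":
    print("subdirectories that fit:", max_subdirs())
```

Under these assumptions the model stops slightly short of 32000
subdirectories, because two links are already in use before the first
mkdir. Regular files never touch the counter, which is why the number of
files in a directory is not bounded by it.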

> Fast lookups in large directories: XFS, reiser, ext3 with htree (?)
> File size more than 2TB: XFS, reiser up to 8TB
> File system size more than 2TB: XFS, reiser up to 16TB
> Ease of data recovery after corruption: ext2, ext3
> 
> Tuning a file system
> 
> Use "noatime" mount option

This can also be combined with the "nodiratime" mount option.
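
To check which mounted file systems already use these options, something
like the following Python sketch could be used. It assumes the usual
/proc/mounts layout (device, mount point, type, comma-separated options);
the helper names and the SAMPLE text are made up for illustration.

```python
def mount_options(mounts_text):
    """Map each mount point to its set of mount options."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Assumed field order: device, mount point, fs type, options, ...
        result[fields[1]] = set(fields[3].split(","))
    return result


def atime_report(mounts_text):
    """Yield (mount point, has noatime, has nodiratime) per mount."""
    for mount_point, opts in sorted(mount_options(mounts_text).items()):
        yield mount_point, "noatime" in opts, "nodiratime" in opts


# Made-up sample, used when /proc/mounts is not available.
SAMPLE = """\
/dev/hda1 / ext3 rw,noatime,nodiratime 0 0
/dev/hda2 /home ext3 rw 0 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    for row in atime_report(text):
        print(row)
```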

>  - atime makes read workloads into random write workloads, yuck
>  - This is Ubuntu installation default
>  - I have a report that mutt doesn't work with this because atime is
>    never updated but mtime is, maybe some kind of lazy atime is better?

With noatime, mutt does indeed think that a mailbox always has new
content. However, this only happens with mbox-style mailboxes; maildir
and mh-style mailboxes just work.
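
A rough Python sketch of why mbox breaks, assuming the mail reader decides
"new mail" by checking whether the file's mtime is later than its atime (my
understanding of the mbox heuristic, not taken from the mutt source). The
function names are made up, and the timestamps are set by hand with
os.utime() so the result does not depend on how the file system is mounted.

```python
import os
import tempfile


def looks_like_new_mail(path):
    """Assumed mbox heuristic: modified more recently than last read."""
    st = os.stat(path)
    return st.st_mtime > st.st_atime


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
    try:
        # Last read at t=100, mail delivered (file written) at t=200.
        os.utime(path, (100, 200))
        after_delivery = looks_like_new_mail(path)

        # atime enabled: reading the mailbox at t=300 updates atime.
        os.utime(path, (300, 200))
        read_with_atime = looks_like_new_mail(path)

        # noatime: the same read leaves atime stuck at t=100.
        os.utime(path, (100, 200))
        read_with_noatime = looks_like_new_mail(path)

        return after_delivery, read_with_atime, read_with_noatime
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

With atime updates the flag clears once the mailbox has been read; with
noatime it stays set after every delivery. Maildir and mh keep one file
per message, so they do not need this timestamp comparison.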

>  - Don't do if you want to e.g., track down hackers
> 
> Choosing journaling mode in ext3
>  - Default is "ordered", usually the right choice
>  - "journal" is slower but guarantees data is on-disk as well
>  - "writeback" is faster but may result in garbage/security leaks in
>    your file data
> 
> Choosing block size
>  - You can do this at mkfs time
>  - tradeoff is space wasted vs. max file/fs size (other considerations?)
>  - limitation is system page size

NTFS has support for block sizes larger than page size. There were some
patches from Anton Altaparmakov to allow such block sizes, but IIRC
they are NTFS-only and were not made generically available to all
filesystems.
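
For the common case where block size may not exceed page size, a small
Python sketch of the check. The candidate list is an assumption for
illustration (powers of two from 1k to 8k), and usable_block_sizes() is a
made-up helper, not part of any mkfs tool.

```python
import os


def usable_block_sizes(page_size, candidates=(1024, 2048, 4096, 8192)):
    """Return the candidate block sizes that do not exceed the page size."""
    return [size for size in candidates if size <= page_size]


def system_page_size():
    """Page size of the running system, in bytes (Unix only)."""
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    page = system_page_size()
    print("page size:", page)
    print("block sizes to consider:", usable_block_sizes(page))
```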


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands