[PATCH 0/3 v2] ext4: Speedup orphan file handling

  Hello,

This is the second version of my patches to speed up orphan inode handling
in ext4.

Orphan inode handling in ext4 is a bottleneck for workloads which heavily
exercise truncate / unlink of small files, as they contend on the global
s_orphan_mutex (when you have fast enough storage). This patch set implements
a new way of handling orphan inodes: instead of using a linked list, we store
inode numbers of orphaned inodes in a file, which can be implemented in a
more scalable manner than linked list manipulations. See the description of
patch 2/3 for more details.
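
To illustrate why a table of inode numbers can scale better than a linked
list guarded by one mutex, here is a minimal userspace sketch. It is not the
code from patch 2/3: the names orphan_slots, orphan_add, orphan_del and the
ORPHAN_SLOTS capacity are hypothetical, and the in-memory array only stands in
for the blocks of the orphan file. Each thread claims a free slot with a
compare-and-swap, so threads working on different slots do not serialize on a
global lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ORPHAN_SLOTS 1024	/* hypothetical capacity of this sketch */

/* One slot per orphaned inode; 0 means "free" (inode 0 is never valid). */
static _Atomic uint32_t orphan_slots[ORPHAN_SLOTS];

/*
 * Record ino as orphaned. Returns the index of the claimed slot,
 * or -1 when every slot is in use.
 */
static int orphan_add(uint32_t ino)
{
	/* Start probing at a position derived from ino to spread threads out. */
	size_t start = ino % ORPHAN_SLOTS;

	for (size_t i = 0; i < ORPHAN_SLOTS; i++) {
		size_t slot = (start + i) % ORPHAN_SLOTS;
		uint32_t expected = 0;

		/* Claim the slot only if it is still free. */
		if (atomic_compare_exchange_strong(&orphan_slots[slot],
						   &expected, ino))
			return (int)slot;
	}
	return -1;
}

/* Release a slot previously returned by orphan_add(). */
static void orphan_del(int slot)
{
	assert(slot >= 0 && slot < ORPHAN_SLOTS);
	atomic_store(&orphan_slots[slot], 0);
}
```

A caller would remember the slot index returned by orphan_add() (for example
alongside the in-memory inode) so that removal is a single store rather than a
list walk.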

The patch set achieves significant gains both for a microbenchmark stressing
orphan inode handling (truncating a file byte-by-byte, several threads in
parallel) and for the reaim new_fserver workload. As a highlight, microbenchmark
runtime for 128 threads is reduced from the original 160 s down to 71 s, which
is also the time it takes the benchmark to run when orphan inode handling
is completely disabled. For full numbers you can check the commit logs of
patches 2/3 and 3/3. You can also check my presentation from Vault at
http://events.linuxfoundation.org/sites/events/files/slides/ext4-scaling.pdf
for graphs from tests.
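
For readers who want a feel for the microbenchmark, the per-thread work might
look roughly like the sketch below. This is an assumption about its shape, not
the actual benchmark source: the function name truncate_byte_by_byte is
hypothetical, and the file is extended with ftruncate() rather than written
with data. The benchmark described above would run one such loop per thread,
each on its own file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create the file at path, extend it to size bytes, then shrink it one
 * byte per ftruncate() call until it is empty. Returns the number of
 * shrinking truncates performed, or -1 on error. The file is removed
 * before returning.
 */
static long truncate_byte_by_byte(const char *path, off_t size)
{
	long calls = 0;
	int fd;

	assert(size >= 0);
	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, size) < 0)
		goto fail;

	/* Each shrinking truncate is one pass through the truncate path. */
	for (off_t len = size - 1; len >= 0; len--) {
		if (ftruncate(fd, len) < 0)
			goto fail;
		calls++;
	}

	close(fd);
	unlink(path);
	return calls;

fail:
	close(fd);
	unlink(path);
	return -1;
}
```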

I'd welcome any review, thoughts, or ideas about the patches.

The kernel part of the feature is complete, and I have also implemented full
support in e2fsprogs. That still needs some debugging (especially the e2fsck
part), but support in mke2fs and tune2fs is fine. I'll post these as a separate
patch so that people can try this out.

For now I'm using inode 9 for the orphan file. I know that is reserved as
EXT2_EXCLUDE_INO, but at least for the sake of testing that should be fine.

								Honza

Changes since v1:
* orphan blocks now have magic numbers
* split out orphan handling to a separate source file
* some smaller updates according to review