Hi,
4.9 (and earlier) LTS kernels are missing this:
commit ec00022030da5761518476096626338bd67df57a
Author: Tahsin Erdogan <tahsin@xxxxxxxxxx>
Date: Sat Aug 5 22:41:42 2017 -0400
ext4: inplace xattr block update fails to deduplicate blocks
OK to backport it?
I tested it briefly in 4.9, seems to work.
One of our testers noticed a glusterfs performance regression when going
from 4.4 to 4.9, caused by the duplicated blocks.
If I understand everything correctly, in 4.4 mbcache uses the block
number in the hash table bucket calculation, and the hash table is
populated quite evenly even if there are duplicates. So the mbcache is fast.
But in later kernels mbcache puts all the duplicate entries into a
single bucket. As the entries are stored in one big linked list, this
obviously makes the mbcache slow.
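
To illustrate the bucket behaviour described above, here is a toy model
(not kernel code). The hash function, table size, block numbers and key
value are all made up for the illustration; only the idea of "bucket
chosen from the block number" versus "bucket chosen from the content
hash only" is taken from the description above.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 1024u	/* toy table size, not the kernel's */

/* Toy multiplicative hash (an assumption; the kernel uses its own helpers). */
static uint32_t toy_hash(uint64_t v)
{
	return (uint32_t)((v * 0x9E3779B97F4A7C15ull) >> 32);
}

/* Bucket chosen from the block number, so duplicates get spread out. */
static uint32_t bucket_by_block(uint64_t block)
{
	return toy_hash(block) % NBUCKETS;
}

/* Bucket chosen from the content hash only, so duplicates collide. */
static uint32_t bucket_by_key(uint32_t key)
{
	return toy_hash(key) % NBUCKETS;
}

/*
 * Insert n duplicate xattr blocks (same key, distinct block numbers)
 * and return the length of the longest bucket chain.
 */
static size_t longest_chain(size_t n, int by_key_only)
{
	size_t counts[NBUCKETS] = {0};
	size_t max = 0;
	const uint32_t key = 0xdeadbeef;	/* hypothetical content hash */

	for (size_t i = 0; i < n; i++) {
		uint64_t block = 100000 + i;	/* hypothetical block numbers */
		uint32_t b = by_key_only ? bucket_by_key(key)
					 : bucket_by_block(block);

		if (++counts[b] > max)
			max = counts[b];
	}
	return max;
}
```

With by_key_only set, every duplicate lands in the same bucket and the
chain length equals n, so a lookup has to walk all of them. With the
block number in the calculation the same entries are spread over many
buckets and the chains stay short.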
I tested this in 4.9 (which still has the ext4_xattr_rehash() call that
got eliminated in commit "ext4: eliminate xattr entry e_hash
recalculation for removes"):
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 3eeed8f0aa06..3fadfabcac39 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -837,8 +837,6 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 				if (!IS_LAST_ENTRY(s->first))
 					ext4_xattr_rehash(header(s->base),
 							  s->here);
-				ext4_xattr_cache_insert(ext4_mb_cache,
-					bs->bh);
 			}
 			ext4_xattr_block_csum_set(inode, bs->bh);
 			unlock_buffer(bs->bh);
@@ -959,6 +957,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 		} else if (bs->bh && s->base == bs->bh->b_data) {
 			/* We were modifying this block in-place. */
 			ea_bdebug(bs->bh, "keeping this block");
+			ext4_xattr_cache_insert(ext4_mb_cache, bs->bh);
 			new_bh = bs->bh;
 			get_bh(new_bh);
 		} else {
Tommi