
The patch titled
     Subject: xarray: inline xas_descend to improve performance
has been added to the -mm mm-unstable branch.  Its filename is
     xarray-inline-xas_descend-to-improve-performance.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/xarray-inline-xas_descend-to-improve-performance.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Long Li <leo.lilong@xxxxxxxxxx>
Subject: xarray: inline xas_descend to improve performance
Date: Tue, 16 Apr 2024 14:16:28 +0800

The commit 63b1898fffcd ("XArray: Disallow sibling entries of nodes")
modified the xas_descend function in such a way that it was no longer
being compiled as an inline function, because it increased the size of
xas_descend(), and the compiler no longer optimizes it as inline.  This
had a negative impact on performance, xas_descend is called frequently to
traverse downwards in the xarray tree, making it a hot function.

Inlining xas_descend has been shown to significantly improve performance
by approximately 4.95% in the iozone write test.

Machine: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
#iozone -i 0 -i 1 -s 64g -r 16m -f /test/tmptest

Before this patch:

       kB    reclen    write   rewrite     read    reread
 67108864     16384  2230080   3637689  6315197   5496027

After this patch:

       kB    reclen    write   rewrite     read    reread
 67108864     16384  2340360   3666175  6272401   5460782

Percentage change:
                       4.95%     0.78%   -0.68%    -0.64%

This patch introduces inlining to the xas_descend function.  While this
change increases the size of lib/xarray.o, the performance gains in
critical workloads make this an acceptable trade-off.

Size comparison before and after patch:
text    .data   .bss    file
0x3502  0       0       lib/xarray.o.before
0x3602  0       0       lib/xarray.o.after

Link: https://lkml.kernel.org/r/20240416061628.3768901-1-leo.lilong@xxxxxxxxxx
Signed-off-by: Long Li <leo.lilong@xxxxxxxxxx>
Cc: Hou Tao <houtao1@xxxxxxxxxx>
Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Cc: yangerkun <yangerkun@xxxxxxxxxx>
Cc: Zhang Yi <yi.zhang@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 lib/xarray.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/lib/xarray.c~xarray-inline-xas_descend-to-improve-performance
+++ a/lib/xarray.c
@@ -200,7 +200,8 @@ static void *xas_start(struct xa_state *
 	return entry;
 }
 
-static void *xas_descend(struct xa_state *xas, struct xa_node *node)
+static __always_inline void *xas_descend(struct xa_state *xas,
+					struct xa_node *node)
 {
 	unsigned int offset = get_offset(xas->xa_index, node);
 	void *entry = xa_entry(xas->xa, node, offset);
_

Patches currently in -mm which might be from leo.lilong@xxxxxxxxxx are

xarray-inline-xas_descend-to-improve-performance.patch
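
For readers unfamiliar with the annotation used in the diff above, here is a
minimal userspace sketch of the same idea.  It is not the xarray code: the
macro my_always_inline, the constants CHUNK_SHIFT and CHUNK_MASK, and the
functions slot_offset() and walk() are all hypothetical names chosen for
illustration, and the sketch assumes GCC or Clang, which accept the
always_inline function attribute.

```c
#include <stdio.h>

/*
 * Userspace stand-in for an always-inline annotation.
 * Assumption: GCC or Clang, which accept this function attribute.
 */
#define my_always_inline inline __attribute__((__always_inline__))

/* Hypothetical tree geometry: 64 slots per node. */
#define CHUNK_SHIFT 6
#define CHUNK_MASK  ((1UL << CHUNK_SHIFT) - 1)

/*
 * Hypothetical hot helper: select the slot for 'index' in a node whose
 * entries each cover 2^shift indices.  The attribute asks the compiler to
 * expand the body at every call site instead of emitting a call.
 */
static my_always_inline unsigned int slot_offset(unsigned long index,
						 unsigned int shift)
{
	return (unsigned int)((index >> shift) & CHUNK_MASK);
}

/*
 * Descend from the top level to the leaves, calling the helper once per
 * level.  This loop is where a forced inline removes per-level call
 * overhead.  Returns the sum of the slot offsets so the result is
 * observable.
 */
static unsigned long walk(unsigned long index, int top_shift)
{
	unsigned long sum = 0;
	int shift;

	for (shift = top_shift; shift >= 0; shift -= CHUNK_SHIFT)
		sum += slot_offset(index, (unsigned int)shift);
	return sum;
}
```

The trade-off is the one the changelog describes: every call site carries its
own copy of the helper, so the object file grows in exchange for fewer calls
on the hot path.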