[patch 133/140] radix tree: use GFP_ZONEMASK bits of gfp_t for flags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>
Subject: radix tree: use GFP_ZONEMASK bits of gfp_t for flags

Patch series "XArray", v9.  (First part thereof).

This patchset is, I believe, appropriate for merging for 4.17.  It
contains the XArray implementation, to eventually replace the radix tree,
and converts the page cache to use it.

This conversion keeps the radix tree and XArray data structures in sync at
all times.  That allows us to convert the page cache one function at a
time and should allow for easier bisection.  Other than renaming some
elements of the structures, the data structures are fundamentally
unchanged; a radix tree walk and an XArray walk will touch the same number
of cachelines.  I have changes planned to the XArray data structure, but
those will happen in future patches.

Improvements the XArray has over the radix tree:

 - The radix tree provides operations like other trees do; 'insert' and
   'delete'.  But what most users really want is an automatically resizing
   array, and so it makes more sense to give users an API that is like
   an array -- 'load' and 'store'.  We still have an 'insert' operation
   for users that really want that semantic.
 - The XArray considers locking as part of its API.  This simplifies a lot
   of users who formerly had to manage their own locking just for the
   radix tree.  It also improves code generation as we can now tell RCU
   that we're holding a lock and it doesn't need to generate as much
   fencing code.  The other advantage is that tree nodes can be moved
   (not yet implemented).
 - GFP flags are now parameters to calls which may need to allocate
   memory.  The radix tree forced users to decide what the allocation
   flags would be at creation time.  It's much clearer to specify them
   at allocation time.
 - Memory is not preloaded; we don't tie up dozens of pages on the
   off chance that the slab allocator fails.  Instead, we drop the lock,
   allocate a new node and retry the operation.  We have to convert all
   the radix tree, IDA and IDR preload users before we can realise this
   benefit, but I have not yet found a user which cannot be converted.
 - The XArray provides a cmpxchg operation.  The radix tree forces users
   to roll their own (and at least four have).
 - Iterators take a 'max' parameter.  That simplifies many users and
   will reduce the amount of iteration done.
 - Iteration can proceed backwards.  We only have one user for this, but
   since it's called as part of the pagefault readahead algorithm, that
   seemed worth mentioning.
 - RCU-protected pointers are not exposed as part of the API.  There are
   some fun bugs where the page cache forgets to use rcu_dereference()
   in the current codebase.
 - Value entries gain an extra bit compared to radix tree exceptional
   entries.  That gives us the extra bit we need to put huge page swap
   entries in the page cache.
 - Some iterators now take a 'filter' argument instead of having
   separate iterators for tagged/untagged iterations.

The page cache is improved by this:
 - Shorter, easier to read code
 - More efficient iterations
 - Reduction in size of struct address_space
 - Fewer walks from the top of the data structure; the XArray API
   encourages staying at the leaf node and conducting operations there.


This patch (of 8):

None of these bits may be used for slab allocations, so we can use them as
radix tree flags as long as we mask them off before passing them to the
slab allocator.  Move the IDR flag from the high bits to the GFP_ZONEMASK
bits.

Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@xxxxxxxxxxxxx
Signed-off-by: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>
Acked-by: Jeff Layton <jlayton@xxxxxxxxxx>
Cc: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
Cc: Dave Chinner <david@xxxxxxxxxxxxx>
Cc: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/idr.h                  |    3 ++-
 include/linux/radix-tree.h           |    7 ++++---
 lib/radix-tree.c                     |    3 ++-
 tools/testing/radix-tree/linux/gfp.h |    1 +
 4 files changed, 9 insertions(+), 5 deletions(-)

diff -puN include/linux/idr.h~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags include/linux/idr.h
--- a/include/linux/idr.h~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags
+++ a/include/linux/idr.h
@@ -29,7 +29,8 @@ struct idr {
 #define IDR_FREE	0
 
 /* Set the IDR flag and the IDR_FREE tag */
-#define IDR_RT_MARKER		((__force gfp_t)(3 << __GFP_BITS_SHIFT))
+#define IDR_RT_MARKER	(ROOT_IS_IDR | (__force gfp_t)			\
+					(1 << (ROOT_TAG_SHIFT + IDR_FREE)))
 
 #define IDR_INIT_BASE(base) {						\
 	.idr_rt = RADIX_TREE_INIT(IDR_RT_MARKER),			\
diff -puN include/linux/radix-tree.h~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags include/linux/radix-tree.h
--- a/include/linux/radix-tree.h~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags
+++ a/include/linux/radix-tree.h
@@ -104,9 +104,10 @@ struct radix_tree_node {
 	unsigned long	tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
 };
 
-/* The top bits of gfp_mask are used to store the root tags and the IDR flag */
-#define ROOT_IS_IDR	((__force gfp_t)(1 << __GFP_BITS_SHIFT))
-#define ROOT_TAG_SHIFT	(__GFP_BITS_SHIFT + 1)
+/* The IDR tag is stored in the low bits of the GFP flags */
+#define ROOT_IS_IDR	((__force gfp_t)4)
+/* The top bits of gfp_mask are used to store the root tags */
+#define ROOT_TAG_SHIFT	(__GFP_BITS_SHIFT)
 
 struct radix_tree_root {
 	gfp_t			gfp_mask;
diff -puN lib/radix-tree.c~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags lib/radix-tree.c
--- a/lib/radix-tree.c~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags
+++ a/lib/radix-tree.c
@@ -146,7 +146,7 @@ static unsigned int radix_tree_descend(c
 
 static inline gfp_t root_gfp_mask(const struct radix_tree_root *root)
 {
-	return root->gfp_mask & __GFP_BITS_MASK;
+	return root->gfp_mask & (__GFP_BITS_MASK & ~GFP_ZONEMASK);
 }
 
 static inline void tag_set(struct radix_tree_node *node, unsigned int tag,
@@ -2285,6 +2285,7 @@ void __init radix_tree_init(void)
 	int ret;
 
 	BUILD_BUG_ON(RADIX_TREE_MAX_TAGS + __GFP_BITS_SHIFT > 32);
+	BUILD_BUG_ON(ROOT_IS_IDR & ~GFP_ZONEMASK);
 	radix_tree_node_cachep = kmem_cache_create("radix_tree_node",
 			sizeof(struct radix_tree_node), 0,
 			SLAB_PANIC | SLAB_RECLAIM_ACCOUNT,
diff -puN tools/testing/radix-tree/linux/gfp.h~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags tools/testing/radix-tree/linux/gfp.h
--- a/tools/testing/radix-tree/linux/gfp.h~radix-tree-use-gfp_zonemask-bits-of-gfp_t-for-flags
+++ a/tools/testing/radix-tree/linux/gfp.h
@@ -19,6 +19,7 @@
 
 #define __GFP_RECLAIM	(__GFP_DIRECT_RECLAIM|__GFP_KSWAPD_RECLAIM)
 
+#define GFP_ZONEMASK	0x0fu
 #define GFP_ATOMIC	(__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
 #define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
 #define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)
_
--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux