On Mon, Apr 09, 2018 at 07:41:52PM -0700, Matthew Wilcox wrote:
> On Tue, Apr 10, 2018 at 11:33:39AM +0900, Minchan Kim wrote:
> > @@ -522,7 +532,7 @@ EXPORT_SYMBOL(radix_tree_preload);
> >   */
> >  int radix_tree_maybe_preload(gfp_t gfp_mask)
> >  {
> > -	if (gfpflags_allow_blocking(gfp_mask))
> > +	if (gfpflags_allow_blocking(gfp_mask) && !(gfp_mask & __GFP_ZERO))
> >  		return __radix_tree_preload(gfp_mask, RADIX_TREE_PRELOAD_SIZE);
> >  	/* Preloading doesn't help anything with this gfp mask, skip it */
> >  	preempt_disable();
> 
> No, you've completely misunderstood what's going on in this function.

Okay, I hope this version clears the current concerns.

From fb37c41b90f7d3ead1798e5cb7baef76709afd94 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@xxxxxxxxxx>
Date: Tue, 10 Apr 2018 11:54:57 +0900
Subject: [PATCH v3] mm: workingset: fix NULL ptr dereference

The radix tree's shadow entry code relies on a newly allocated node
having node->private_list in the list_empty state. Currently, that
initialization is done in the SLAB constructor, which means a radix
tree node is initialized only when *slub allocates a new page*, not
when *slub allocates a new object*.

If some FS or subsystem passes __GFP_ZERO in gfp_mask, the newly
allocated node can end up with !list_empty(node->private_list)
because the slab allocator memsets the object. That results in a
NULL pointer dereference in workingset_update_node() once the
list_empty() check fails.

This patch fixes it.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
Cc: Chao Yu <yuchao0@xxxxxxxxxx>
Cc: Christopher Lameter <cl@xxxxxxxxx>
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: stable@xxxxxxxxxxxxxxx
Reported-by: Chris Fries <cfries@xxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
 lib/radix-tree.c | 9 +++++++++
 mm/filemap.c     | 5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index da9e10c827df..7569e637dbaa 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -470,6 +470,15 @@ static __must_check int __radix_tree_preload(gfp_t gfp_mask, unsigned nr)
 	struct radix_tree_node *node;
 	int ret = -ENOMEM;
 
+	/*
+	 * A newly allocated node must have node->private_list in the
+	 * INIT_LIST_HEAD state for the workingset shadow entry code.
+	 * If a user passes __GFP_ZERO by mistake, the slab allocator will
+	 * clear node->private_list, which leads to a BUG. Rather than
+	 * oopsing, just fix the mask up and warn about it.
+	 */
+	if (WARN_ON(gfp_mask & __GFP_ZERO))
+		gfp_mask &= ~__GFP_ZERO;
 	/*
 	 * Nodes preloaded by one cgroup can be be used by another cgroup, so
 	 * they should never be accounted to any particular memory cgroup.
diff --git a/mm/filemap.c b/mm/filemap.c
index ab77e19ab09c..b6de9d691c8a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -786,7 +786,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
 	VM_BUG_ON_PAGE(!PageLocked(new), new);
 	VM_BUG_ON_PAGE(new->mapping, new);
 
-	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_preload(gfp_mask & ~(__GFP_HIGHMEM | __GFP_ZERO));
 	if (!error) {
 		struct address_space *mapping = old->mapping;
 		void (*freepage)(struct page *);
@@ -842,7 +842,8 @@ static int __add_to_page_cache_locked(struct page *page,
 		return error;
 	}
 
-	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_maybe_preload(gfp_mask &
+				~(__GFP_HIGHMEM | __GFP_ZERO));
 	if (error) {
 		if (!huge)
 			mem_cgroup_cancel_charge(page, memcg, false);
-- 
2.17.0.484.g0c8726318c-goog
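
For anyone who wants to see the failure mode outside the kernel, here is a
minimal userspace sketch (plain C, not kernel code: the list helpers below
just mirror include/linux/list.h, and "struct fake_node" is a made-up
stand-in for a radix tree node). It shows that a list set up with
INIT_LIST_HEAD() passes list_empty() while a memset-to-zero one does not,
which is exactly why a __GFP_ZERO'd node takes the shadow-node path with
NULL pointers in its list.

/*
 * Userspace sketch of why __GFP_ZERO breaks the workingset assumption:
 * an INIT_LIST_HEAD'ed list is "empty" because it points to itself,
 * while a memset-to-zero list has NULL pointers and so fails the
 * list_empty() check.
 */
#include <stdio.h>
#include <string.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

struct fake_node {
	struct list_head private_list;	/* stands in for radix_tree_node */
};

int main(void)
{
	struct fake_node ctor_node, zero_node;

	/* What the SLAB constructor guarantees for a (re)used object. */
	INIT_LIST_HEAD(&ctor_node.private_list);

	/* What __GFP_ZERO does to a freshly allocated object. */
	memset(&zero_node, 0, sizeof(zero_node));

	printf("ctor node list_empty:   %d\n", list_empty(&ctor_node.private_list));
	printf("zeroed node list_empty: %d\n", list_empty(&zero_node.private_list));
	/*
	 * The zeroed node reports "not empty" even though it holds nothing,
	 * so code that then follows ->next (as the shadow-node LRU handling
	 * does) dereferences NULL.
	 */
	return 0;
}

Running it prints 1 for the constructed node and 0 for the zeroed one.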