On Mon 12-02-18 16:12:27, Huang, Ying wrote:
> From: Huang Ying <ying.huang@xxxxxxxxx>
>
> When page_mapping() is called and the mapping is dereferenced in
> page_evictable() through shrink_active_list(), it is possible for the
> inode to be truncated and the embedded address space to be freed at
> the same time.  This may lead to the following race.
>
> CPU1                                                CPU2
>
> truncate(inode)                                     shrink_active_list()
>   ...                                                 page_evictable(page)
>   truncate_inode_page(mapping, page);
>     delete_from_page_cache(page)
>       spin_lock_irqsave(&mapping->tree_lock, flags);
>         __delete_from_page_cache(page, NULL)
>           page_cache_tree_delete(..)
>             ...                                         mapping = page_mapping(page);
>             page->mapping = NULL;
>             ...
>       spin_unlock_irqrestore(&mapping->tree_lock, flags);
>       page_cache_free_page(mapping, page)
>         put_page(page)
>           if (put_page_testzero(page)) -> false
> - inode now has no pages and can be freed including embedded address_space
>
>                                                       mapping_unevictable(mapping)
>                                                         test_bit(AS_UNEVICTABLE, &mapping->flags);
> - we've dereferenced mapping which is potentially already free.
>
> Similar race exists between swap cache freeing and page_evictable() too.
>
> The address_space in inode and swap cache will be freed after a RCU
> grace period.  So the races are fixed via enclosing the page_mapping()
> and address_space usage in rcu_read_lock/unlock().  Some comments are
> added in code to make it clear what is protected by the RCU read lock.
>
> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>

The race looks real (although very unlikely) and the patch looks good to me.
You can add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

								Honza

> ---
>  mm/vmscan.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index d1c1e00b08bb..10a0f32a3f90 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3886,7 +3886,13 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
>   */
>  int page_evictable(struct page *page)
>  {
> -	return !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
> +	int ret;
> +
> +	/* Prevent address_space of inode and swap cache from being freed */
> +	rcu_read_lock();
> +	ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
> +	rcu_read_unlock();
> +	return ret;
>  }
>
>  #ifdef CONFIG_SHMEM
> --
> 2.15.1
>
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
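
For readers who want to poke at the logic outside the kernel, the shape of the patched page_evictable() can be modelled in userspace roughly as below. This is only a sketch under loud assumptions: rcu_read_lock()/rcu_read_unlock() are hypothetical stand-ins that merely count nesting depth, struct page and struct address_space are cut down to the two fields the check reads, and the AS_UNEVICTABLE bit number is picked for the example. It shows where the dereference sits relative to the read-side section; it does not reproduce the grace-period guarantee that the real RCU primitives provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; the kernel primitives do far more. */
static int rcu_depth;                 /* nesting depth of read-side sections */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Reduced models of the kernel structures, for illustration only. */
struct address_space { unsigned long flags; };
struct page { struct address_space *mapping; bool mlocked; };

#define AS_UNEVICTABLE 0              /* bit number assumed for this sketch */

static struct address_space *page_mapping(const struct page *page)
{
    return page->mapping;
}

static bool mapping_unevictable(const struct address_space *mapping)
{
    /* This dereference is what must stay inside the read-side section. */
    return mapping && (mapping->flags & (1UL << AS_UNEVICTABLE));
}

static int page_evictable(const struct page *page)
{
    int ret;

    /* Lookup and dereference of the mapping are bracketed together. */
    rcu_read_lock();
    ret = !mapping_unevictable(page_mapping(page)) && !page->mlocked;
    rcu_read_unlock();
    return ret;
}
```

The point of keeping page_mapping() and mapping_unevictable() inside the same bracket is that the pointer is never used after the section ends, which is what the comment added by the patch is describing.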