Minchan Kim <minchan@xxxxxxxxxx> writes: > On Wed, Dec 20, 2017 at 09:26:32AM +0800, Huang, Ying wrote: >> From: Huang Ying <ying.huang@xxxxxxxxx> >> >> When the swapin is performed, after getting the swap entry information >> from the page table, system will swap in the swap entry, without any >> lock held to prevent the swap device from being swapoff. This may >> cause the race like below, >> >> CPU 1 CPU 2 >> ----- ----- >> do_swap_page >> swapin_readahead >> __read_swap_cache_async >> swapoff swapcache_prepare >> p->swap_map = NULL __swap_duplicate >> p->swap_map[?] /* !!! NULL pointer access */ >> >> Because swapoff is usually done when system shutdown only, the race >> may not hit many people in practice. But it is still a race need to >> be fixed. >> >> To fix the race, get_swap_device() is added to check whether the >> specified swap entry is valid in its swap device. If so, it will keep >> the swap entry valid via preventing the swap device from being >> swapoff, until put_swap_device() is called. >> >> Because swapoff() is very race code path, to make the normal path runs >> as fast as possible, RCU instead of reference count is used to >> implement get/put_swap_device(). From get_swap_device() to >> put_swap_device(), the RCU read lock is held, so synchronize_rcu() in >> swapoff() will wait until put_swap_device() is called. >> >> In addition to swap_map, cluster_info, etc. data structure in the >> struct swap_info_struct, the swap cache radix tree will be freed after >> swapoff, so this patch fixes the race between swap cache looking up >> and swapoff too. >> >> Cc: Hugh Dickins <hughd@xxxxxxxxxx> >> Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> >> Cc: Minchan Kim <minchan@xxxxxxxxxx> >> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> >> Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> >> Cc: Shaohua Li <shli@xxxxxx> >> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> >> Cc: "Jrme Glisse" <jglisse@xxxxxxxxxx> >> Cc: Michal Hocko <mhocko@xxxxxxxx> >> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> >> Cc: David Rientjes <rientjes@xxxxxxxxxx> >> Cc: Rik van Riel <riel@xxxxxxxxxx> >> Cc: Jan Kara <jack@xxxxxxx> >> Cc: Dave Jiang <dave.jiang@xxxxxxxxx> >> Cc: Aaron Lu <aaron.lu@xxxxxxxxx> >> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx> >> >> Changelog: >> >> v4: >> >> - Use synchronize_rcu() in enable_swap_info() to reduce overhead of >> normal paths further. > > Hi Huang, Hi, Minchan, > This version is much better than old. To me, it's due to not rcu, > srcu, refcount thing but it adds swap device dependency(i.e., get/put) > into every swap related functions so users who don't interested on swap > don't need to care of it. Good. > > The problem is caused by freeing by swap related-data structure > *dynamically* while old swap logic was based on static data > structure(i.e., never freed and the verify it's stale). > So, I reviewed some places where use PageSwapCache and swp_entry_t > which could make access of swap related data structures. > > A example is __isolate_lru_page > > It calls page_mapping to get a address_space. > What happens if the page is on SwapCache and raced with swapoff? > The mapping got could be disappeared by the race. Right? Yes. We should think about that. Considering the file cache pages, the address_space backing the file cache pages may be freed dynamically too. So to use page_mapping() return value for the file cache pages, some kind of locking is needed to guarantee the address_space isn't freed under us. Page may be locked, or under writeback, or some other locks need to be held, for example, page table lock, or lru_lock, etc. For __isolate_lru_page(), lru_lock will be held when it is called. And we will call synchronize_rcu() between clear PageSwapCache and free swap cache, so the usage of swap cache in __isolate_lru_page() should be safe. Do you think my analysis makes sense? Best Regards, Huang, Ying -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>