On 2021/4/12 9:44, Huang, Ying wrote: > Miaohe Lin <linmiaohe@xxxxxxxxxx> writes: > >> On 2021/4/10 1:17, Tim Chen wrote: >>> >>> >>> On 4/9/21 1:42 AM, Miaohe Lin wrote: >>>> On 2021/4/9 5:34, Tim Chen wrote: >>>>> >>>>> >>>>> On 4/8/21 6:08 AM, Miaohe Lin wrote: >>>>>> When I was investigating the swap code, I found the below possible race >>>>>> window: >>>>>> >>>>>> CPU 1 CPU 2 >>>>>> ----- ----- >>>>>> do_swap_page >>>>>> synchronous swap_readpage >>>>>> alloc_page_vma >>>>>> swapoff >>>>>> release swap_file, bdev, or ... >>>>> >>>> >>>> Many thanks for quick review and reply! >>>> >>>>> Perhaps I'm missing something. The release of swap_file, bdev etc >>>>> happens after we have cleared the SWP_VALID bit in si->flags in destroy_swap_extents >>>>> if I read the swapoff code correctly. >>>> Agree. Let's look this more close: >>>> CPU1 CPU2 >>>> ----- ----- >>>> swap_readpage >>>> if (data_race(sis->flags & SWP_FS_OPS)) { >>>> swapoff >>>> p->swap_file = NULL; >>>> struct file *swap_file = sis->swap_file; >>>> struct address_space *mapping = swap_file->f_mapping;[oops!] >>>> ... >>>> p->flags = 0; >>>> ... >>>> >>>> Does this make sense for you? >>> >>> p->swapfile = NULL happens after the >>> p->flags &= ~SWP_VALID, synchronize_rcu(), destroy_swap_extents() sequence in swapoff(). >>> >>> So I don't think the sequence you illustrated on CPU2 is in the right order. >>> That said, without get_swap_device/put_swap_device in swap_readpage, you could >>> potentially blow pass synchronize_rcu() on CPU2 and causes a problem. so I think >>> the problematic race looks something like the following: >>> >>> >>> CPU1 CPU2 >>> ----- ----- >>> swap_readpage >>> if (data_race(sis->flags & SWP_FS_OPS)) { >>> swapoff >>> p->flags = &= ~SWP_VALID; >>> .. >>> synchronize_rcu(); >>> .. >>> p->swap_file = NULL; >>> struct file *swap_file = sis->swap_file; >>> struct address_space *mapping = swap_file->f_mapping;[oops!] >>> ... >>> ... >>> >> >> Agree. This is also what I meant to illustrate. And you provide a better one. Many thanks! > > For the pages that are swapped in through swap cache. That isn't an > issue. Because the page is locked, the swap entry will be marked with > SWAP_HAS_CACHE, so swapoff() cannot proceed until the page has been > unlocked. > > So the race is for the fast path as follows, > > if (data_race(si->flags & SWP_SYNCHRONOUS_IO) && > __swap_count(entry) == 1) > > I found it in your original patch description. But please make it more > explicit to reduce the potential confusing. Sure. Should I rephrase the commit log to clarify this or add a comment in the code? Thanks. > > Best Regards, > Huang, Ying > . >