Re: [PATCH] mm: fix the race between swapin_readahead and SWP_SYNCHRONOUS_IO path

On 9/3/2019 5:47 PM, Vinayak Menon wrote:
> On 9/3/2019 5:11 PM, Michal Hocko wrote:
>> On Tue 03-09-19 11:43:16, Vinayak Menon wrote:
>>> Hi Michal,
>>>
>>> Thanks for reviewing this.
>>>
>>>
>>> On 9/2/2019 6:51 PM, Michal Hocko wrote:
>>>> On Fri 30-08-19 18:13:31, Vinayak Menon wrote:
>>>>> The following race is observed, due to which a process faulting
>>>>> on a swap entry finds the page neither in swapcache nor in swap. This
>>>>> causes zram to give a zero filled page that gets mapped to the
>>>>> process, resulting in a user space crash later.
>>>>>
>>>>> Consider parent and child processes Pa and Pb sharing the same swap
>>>>> slot with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set.
>>>>> Virtual address 'VA' of Pa and Pb points to the shared swap entry.
>>>>>
>>>>> Pa                                       Pb
>>>>>
>>>>> fault on VA                              fault on VA
>>>>> do_swap_page                             do_swap_page
>>>>> lookup_swap_cache fails                  lookup_swap_cache fails
>>>>>                                          Pb scheduled out
>>>>> swapin_readahead (deletes zram entry)
>>>>> swap_free (makes swap_count 1)
>>>>>                                          Pb scheduled in
>>>>>                                          swap_readpage (swap_count == 1)
>>>>>                                          Takes SWP_SYNCHRONOUS_IO path
>>>>>                                          zram entry absent
>>>>>                                          zram gives a zero filled page
>>>> This sounds like a zram issue, right? Why is a generic swap path changed
>>>> then?
>>> I think the zram entry being deleted by Pa, and zram then giving out a
>>> zeroed page to Pb, is normal.
>> Isn't that a data loss? The race you mentioned shouldn't be possible
>> with the standard swap storage AFAIU. If that is really the case then
>> the zram needs a fix rather than a generic path. Or at least a very good
>> explanation why the generic path is a preferred way.
>
> AFAIK, there isn't a data loss because, before deleting the entry,
> swap_slot_free_notify makes sure that the page is in swapcache and marks
> the page dirty to ensure a swap out before reclaim. I am referring to the
> comment about this in swap_slot_free_notify. In the case of this race too,
> the page brought to swapcache by Pa is still in swapcache. It is just that
> Pb failed to find it due to the race.
>
> Yes, this race will not happen for standard swap storage; it happens only
> for block devices that set disk->fops->swap_slot_free_notify and have
> SWP_SYNCHRONOUS_IO set (which seems to be only zram).
>
> Now, considering that zram works as expected, the fix is in the generic
> path because the race is due to a bug in the SWP_SYNCHRONOUS_IO handling
> in do_swap_page. And it is only the SWP_SYNCHRONOUS_IO handling in the
> generic path that is modified.
>

Hi Michal,

Do you have any concerns with the patch or the explanation of the problem?
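
To make the interleaving easier to follow, here is a minimal user-space
sketch of the sequence described above. It is a toy model, not kernel code
and not the patch itself: struct slot, dev_read, fault_via_swapcache,
finish_fault_sync and run_race are hypothetical names invented for this
illustration, and the recheck_cache flag only demonstrates the point that
the page brought in by Pa is still in the swapcache when Pb resumes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one swap slot on a zram-like device (hypothetical names). */
struct slot {
    int  swap_count;      /* number of mappings still referencing the slot */
    bool dev_has_data;    /* backing device still holds the entry */
    bool in_swapcache;    /* a page for this slot sits in the swap cache */
    bool cache_dirty;     /* cached page must be written out before reclaim */
    unsigned char data;   /* stand-in for the page contents */
};

/* Device read: a missing entry reads back as zeroes, as described for zram. */
static unsigned char dev_read(const struct slot *s)
{
    return s->dev_has_data ? s->data : 0;
}

/* Pa: read through the swap cache, then drop one reference. The device
 * entry is freed only because the page is in the swap cache, and the
 * cached page is marked dirty so it gets swapped out again before reclaim. */
static unsigned char fault_via_swapcache(struct slot *s)
{
    unsigned char page = dev_read(s);

    s->in_swapcache = true;
    assert(s->swap_count > 0);
    s->swap_count--;                  /* models swap_free */
    if (s->in_swapcache) {            /* models the slot-free notification */
        s->dev_has_data = false;
        s->cache_dirty = true;
    }
    return page;
}

/* Pb, resuming after its earlier swap cache lookup already failed. */
static unsigned char finish_fault_sync(struct slot *s, bool recheck_cache)
{
    if (recheck_cache && s->in_swapcache)
        return s->data;               /* page is still in the swap cache */
    if (s->swap_count == 1)
        return dev_read(s);           /* synchronous path skips the cache */
    return fault_via_swapcache(s);
}

/* Replays the interleaving and returns the byte that Pb ends up mapping. */
static unsigned char run_race(bool recheck_cache)
{
    struct slot s = { 2, true, false, false, 0x5a };

    assert(!s.in_swapcache);          /* both initial lookups fail */
    (void)fault_via_swapcache(&s);    /* Pa runs while Pb is scheduled out */
    return finish_fault_sync(&s, recheck_cache);
}
```

In this model run_race(false) returns 0, i.e. Pb maps a zero filled page,
while run_race(true) returns the original 0x5a because the page is found in
the swapcache.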






