On 9/3/2019 5:47 PM, Vinayak Menon wrote: > On 9/3/2019 5:11 PM, Michal Hocko wrote: >> On Tue 03-09-19 11:43:16, Vinayak Menon wrote: >>> Hi Michal, >>> >>> Thanks for reviewing this. >>> >>> >>> On 9/2/2019 6:51 PM, Michal Hocko wrote: >>>> On Fri 30-08-19 18:13:31, Vinayak Menon wrote: >>>>> The following race is observed due to which a processes faulting >>>>> on a swap entry, finds the page neither in swapcache nor swap. This >>>>> causes zram to give a zero filled page that gets mapped to the >>>>> process, resulting in a user space crash later. >>>>> >>>>> Consider parent and child processes Pa and Pb sharing the same swap >>>>> slot with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set. >>>>> Virtual address 'VA' of Pa and Pb points to the shared swap entry. >>>>> >>>>> Pa Pb >>>>> >>>>> fault on VA fault on VA >>>>> do_swap_page do_swap_page >>>>> lookup_swap_cache fails lookup_swap_cache fails >>>>> Pb scheduled out >>>>> swapin_readahead (deletes zram entry) >>>>> swap_free (makes swap_count 1) >>>>> Pb scheduled in >>>>> swap_readpage (swap_count == 1) >>>>> Takes SWP_SYNCHRONOUS_IO path >>>>> zram enrty absent >>>>> zram gives a zero filled page >>>> This sounds like a zram issue, right? Why is a generic swap path changed >>>> then? >>> I think zram entry being deleted by Pa and zram giving out a zeroed page to Pb is normal. >> Isn't that a data loss? The race you mentioned shouldn't be possible >> with the standard swap storage AFAIU. If that is really the case then >> the zram needs a fix rather than a generic path. Or at least a very good >> explanation why the generic path is a preferred way. > > AFAIK, there isn't a data loss because, before deleting the entry, swap_slot_free_notify makes sure that > > page is in swapcache and marks the page dirty to ensure a swap out before reclaim. I am referring to the > > comment about this in swap_slot_free_notify. In the case of this race too, the page brought to swapcache > > by Pa is still in swapcache. It is just that Pb failed to find it due to the race. > > Yes, this race will not happen for standard swap storage and only for those block devices that set > > disk->fops->swap_slot_free_notify and have SWP_SYNCHRONOUS_IO set (which seems to be only zram). > > Now considering that zram works as expected, the fix is in generic path because the race is due to the bug in > > SWP_SYNCHRONOUS_IO handling in do_swap_page. And it is only the SWP_SYNCHRONOUS_IO handling in > > generic path which is modified. > Hi Michal, Do you see any concerns with the patch or explanation of the problem ?