On Tue 03-09-19 11:43:16, Vinayak Menon wrote:
> Hi Michal,
>
> Thanks for reviewing this.
>
> On 9/2/2019 6:51 PM, Michal Hocko wrote:
> > On Fri 30-08-19 18:13:31, Vinayak Menon wrote:
> >> The following race is observed, due to which a process faulting
> >> on a swap entry finds the page neither in swapcache nor swap. This
> >> causes zram to give a zero filled page that gets mapped to the
> >> process, resulting in a user space crash later.
> >>
> >> Consider parent and child processes Pa and Pb sharing the same swap
> >> slot with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set.
> >> Virtual address 'VA' of Pa and Pb points to the shared swap entry.
> >>
> >> Pa                                       Pb
> >>
> >> fault on VA                              fault on VA
> >> do_swap_page                             do_swap_page
> >> lookup_swap_cache fails                  lookup_swap_cache fails
> >>                                          Pb scheduled out
> >> swapin_readahead (deletes zram entry)
> >> swap_free (makes swap_count 1)
> >>                                          Pb scheduled in
> >>                                          swap_readpage (swap_count == 1)
> >>                                          Takes SWP_SYNCHRONOUS_IO path
> >>                                          zram entry absent
> >>                                          zram gives a zero filled page
> > This sounds like a zram issue, right? Why is a generic swap path changed
> > then?
>
> I think zram entry being deleted by Pa and zram giving out a zeroed
> page to Pb is normal.

Isn't that data loss? The race you mentioned shouldn't be possible with
the standard swap storage AFAIU. If that is really the case then zram
needs a fix rather than the generic path. Or at least a very good
explanation of why changing the generic path is the preferred way.
-- 
Michal Hocko
SUSE Labs