On (22/03/30 14:02), Sergey Senozhatsky wrote:
> On (22/03/25 20:43), Andrew Morton wrote:
> > Two processes under CLONE_VM cloning, user process can be corrupted by
> > seeing zeroed page unexpectedly.
> >
> >       CPU A                        CPU B
> >
> >   do_swap_page                do_swap_page
> >   SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
> >   swap_readpage valid data
> >     swap_slot_free_notify
> >       delete zram entry
> >                               swap_readpage zeroed(invalid) data
> >                               pte_lock
> >                               map the *zero data* to userspace
> >                               pte_unlock
> >   pte_lock
> >   if (!pte_same)
> >     goto out_nomap;
> >   pte_unlock
> >   return and next refault will
> >   read zeroed data
> >
> > The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't
> > increase the refcount of swap slot at copy_mm so it couldn't catch up
> > whether it's safe or not to discard data from backing device. In the
> > case, only the lock it could rely on to synchronize swap slot freeing is
> > page table lock. Thus, this patch gets rid of the swap_slot_free_notify
> > function. With this patch, CPU A will see correct data.
> >
> >       CPU A                        CPU B
> >
> >   do_swap_page                do_swap_page
> >   SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
> >                               swap_readpage original data
> >                               pte_lock
> >                               map the original data
> >                               swap_free
> >                                 swap_range_free
> >                                   bd_disk->fops->swap_slot_free_notify
> >   swap_readpage read zeroed data
> >                               pte_unlock
> >   pte_lock
> >   if (!pte_same)
> >     goto out_nomap;
> >   pte_unlock
> >   return
> >   on next refault will see mapped data by CPU B
> >
> > The concern of the patch would increase memory consumption since it could
> > keep wasted memory with compressed form in zram as well as uncompressed
> > form in address space. However, most of cases of zram uses no readahead
> > and do_swap_page is followed by swap_free so it will free the compressed
> > form from in zram quickly.
>
> Minchan, a quick question, shouldn't this instead revert 3f2b1a04f4493?

Never mind! My bad. The patch looks good to me.
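
For anyone following along, the two interleavings in the quoted changelog can be replayed in userspace. This is only a toy sketch, not kernel code: every toy_* name, the 8-byte "page", and the single-threaded replay of the CPU A / CPU B steps are assumptions made for illustration. The first function frees the slot right after CPU A's read, outside the page table lock; the second frees it only after the mapping step succeeds, as in the patched flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_BYTES 8

/* Hypothetical stand-in for one zram-backed swap slot. */
struct toy_slot {
	bool present;			/* false once the slot is discarded */
	char data[PAGE_BYTES];
};

/* Hypothetical stand-in for the PTE shared by both CLONE_VM tasks. */
struct toy_pte {
	bool is_swap_entry;		/* true while it still holds the swap entry */
	char mapped[PAGE_BYTES];
};

/* Models swap_readpage(): a discarded slot reads back as zeroes. */
static void toy_swap_readpage(const struct toy_slot *slot, char *page)
{
	assert(slot && page);
	if (slot->present)
		memcpy(page, slot->data, PAGE_BYTES);
	else
		memset(page, 0, PAGE_BYTES);
}

/* Models the backing device dropping the slot contents. */
static void toy_slot_free(struct toy_slot *slot)
{
	assert(slot);
	slot->present = false;
	memset(slot->data, 0, PAGE_BYTES);
}

/* Models the pte_same() check under the PTE lock; true if the page was mapped. */
static bool toy_map_if_pte_same(struct toy_pte *pte, const char *page)
{
	assert(pte && page);
	if (!pte->is_swap_entry)
		return false;		/* goto out_nomap */
	memcpy(pte->mapped, page, PAGE_BYTES);
	pte->is_swap_entry = false;
	return true;
}

/* First diagram: returns true if a zeroed page ends up mapped. */
bool toy_race_with_notify_after_read(void)
{
	struct toy_slot slot = { .present = true, .data = "payload" };
	struct toy_pte pte = { .is_swap_entry = true };
	static const char zero[PAGE_BYTES];
	char page_a[PAGE_BYTES], page_b[PAGE_BYTES];

	toy_swap_readpage(&slot, page_a);		/* CPU A: valid data */
	toy_slot_free(&slot);				/* CPU A: notify, no PTE lock held */
	toy_swap_readpage(&slot, page_b);		/* CPU B: zeroed data */
	(void)toy_map_if_pte_same(&pte, page_b);	/* CPU B: maps the zero data */
	(void)toy_map_if_pte_same(&pte, page_a);	/* CPU A: !pte_same, out_nomap */
	return memcmp(pte.mapped, zero, PAGE_BYTES) == 0;
}

/* Second diagram: returns true if a zeroed page ends up mapped. */
bool toy_race_with_free_under_pte_lock(void)
{
	struct toy_slot slot = { .present = true, .data = "payload" };
	struct toy_pte pte = { .is_swap_entry = true };
	static const char zero[PAGE_BYTES];
	char page_a[PAGE_BYTES], page_b[PAGE_BYTES];

	toy_swap_readpage(&slot, page_b);		/* CPU B: original data */
	if (toy_map_if_pte_same(&pte, page_b))		/* CPU B: pte_lock, map */
		toy_slot_free(&slot);			/* CPU B: swap_free under the lock */
	toy_swap_readpage(&slot, page_a);		/* CPU A: zeroed data */
	(void)toy_map_if_pte_same(&pte, page_a);	/* CPU A: !pte_same, page dropped */
	return memcmp(pte.mapped, zero, PAGE_BYTES) == 0;
}
```

In this model the first function reports that the zeroed page got mapped, while the second leaves the original payload mapped even though CPU A still read zeroes, since CPU A's page is thrown away at the pte_same check.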