On Tue, Mar 15, 2022 at 3:09 PM Minchan Kim <minchan@xxxxxxxxxx> wrote:
> I think the problem with CLONE_VM is following race
>
> CPU A                          CPU B
>
> do_swap_page                   do_swap_page
> SWP_SYNCHRONOUS_IO path        SWP_SYNCHRONOUS_IO path
> swap_readpage original data
>   swap_slot_free_notify
>     delete zram entry
>                                swap_readpage zero data
>                                pte_lock
>                                map the *zero data* to userspace
>                                pte_unlock
> pte_lock
> if (!pte_same)
>   goto out_nomap;
> pte_unlock
> return and next refault will
> read zero data
>
> So, CPU A and B see zero data. With patchset below, it changes
>
>
> CPU A                          CPU B
>
> do_swap_page                   do_swap_page
> SWP_SYNCHRONOUS_IO path        SWP_SYNCHRONOUS_IO path
>                                swap_readpage original data
>                                pte_lock
>                                map the original data
>                                swap_free
>                                  swap_range_free
>                                    bd_disk->fops->swap_slot_free_notify
> swap_readpage read zero data
>                                pte_unlock
> pte_lock
> if (!pte_same)
>   goto out_nomap;
> pte_unlock
> return and next refault will
> read correct data again
>
> Here, CPU A could read zero data from zram but that's not a bug
> (IOW, warning injected doesn't mean bug).
>
> The concern of the patch would increase memory size since it could
> increase wasted memory with compressed form in zram and uncompressed
> form in address space. However, most of cases of zram uses no
> readahead and then, do_swap_page is followed by swap_free so it will
> free the compressed from in zram quickly.
>
> Ivan, with this patch, you can see the warning you added in the zram
> but it shouldn't trigger the userspace corruption as mentioned above
> if I understand correctly.
>
> Could you test whether the patch prevent userspace broken?

I'm making an internal build and will push it to some location to see
how it behaves, but it might take a few days to get any sort of
confidence in the results (unless it breaks immediately).

I've also pushed my patch that disables SWP_SYNCHRONOUS_IO to a few
locations yesterday to see how it fares.
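
For reference, the two interleavings quoted above can be replayed as a
small userspace C sketch. This is not kernel code and not the patch
itself: struct model and the helpers read_slot, free_slot, try_map,
race_free_on_read and race_free_after_map are hypothetical stand-ins
for swap_readpage, swap_slot_free_notify and the pte_same check, and
the PTE lock is modeled only by running the steps in the quoted order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page contents: a freed zram slot reads back as zeroes. */
enum { ZERO_DATA = 0x00, ORIGINAL_DATA = 0x5a };

struct model {
	bool slot_present;	/* zram still holds the compressed page */
	bool pte_is_swap;	/* PTE still holds the swap entry */
	int mapped;		/* data userspace sees once a page is mapped */
};

static void model_init(struct model *m)
{
	m->slot_present = true;
	m->pte_is_swap = true;
	m->mapped = -1;		/* nothing mapped yet */
}

/* Stand-in for swap_readpage on the SWP_SYNCHRONOUS_IO path. */
static int read_slot(const struct model *m)
{
	return m->slot_present ? ORIGINAL_DATA : ZERO_DATA;
}

/* Stand-in for swap_slot_free_notify deleting the zram entry. */
static void free_slot(struct model *m)
{
	assert(m->slot_present);	/* the model frees a slot only once */
	m->slot_present = false;
}

/* Stand-in for pte_lock, the pte_same check, mapping, and pte_unlock. */
static bool try_map(struct model *m, int data)
{
	if (!m->pte_is_swap)
		return false;	/* !pte_same: goto out_nomap */
	m->pte_is_swap = false;
	m->mapped = data;
	return true;
}

/* First diagram: the slot is freed as soon as CPU A's read completes. */
int race_free_on_read(void)
{
	struct model m;
	int a, b;

	model_init(&m);
	a = read_slot(&m);	/* CPU A reads original data */
	free_slot(&m);		/* CPU A: slot freed before mapping */
	b = read_slot(&m);	/* CPU B reads zero data */
	try_map(&m, b);		/* CPU B maps the zero data */
	try_map(&m, a);		/* CPU A loses the race, out_nomap */
	return m.mapped;
}

/* Second diagram: the slot is freed only after the page is mapped. */
int race_free_after_map(void)
{
	struct model m;
	int a, b;

	model_init(&m);
	b = read_slot(&m);	/* CPU B reads original data */
	try_map(&m, b);		/* CPU B maps the original data */
	free_slot(&m);		/* swap_free ends in the free notify */
	a = read_slot(&m);	/* CPU A reads zero data, harmlessly */
	try_map(&m, a);		/* CPU A sees a changed PTE, out_nomap */
	return m.mapped;
}
```

In this model race_free_on_read() leaves ZERO_DATA mapped, while
race_free_after_map() leaves ORIGINAL_DATA mapped even though CPU A
still read zeroes from the freed slot, which matches the claim that
the warning can fire without userspace corruption.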