On 1/31/22 17:29, David Hildenbrand wrote: > Currently we have a different COW logic when: > * triggering a read-fault to swapin first and then trigger a write-fault > -> do_swap_page() + do_wp_page() > * triggering a write-fault to swapin > -> do_swap_page() + do_wp_page() only if we fail reuse in do_swap_page() > > The COW logic in do_swap_page() is different than our reuse logic in > do_wp_page(). The COW logic in do_wp_page() -- page_count() == 1 -- makes > currently sure that we certainly don't have a remaining reference, e.g., > via GUP, on the target page we want to reuse: if there is any unexpected > reference, we have to copy to avoid information leaks. > > As do_swap_page() behaves differently, in environments with swap enabled we > can currently have an unintended information leak from the parent to the > child, similar as known from CVE-2020-29374: > > 1. Parent writes to anonymous page > -> Page is mapped writable and modified > 2. Page is swapped out > -> Page is unmapped and replaced by swap entry > 3. fork() > -> Swap entries are copied to child > 4. Child pins page R/O > -> Page is mapped R/O into child > 5. Child unmaps page > -> Child still holds GUP reference > 6. Parent writes to page > -> Page is reused in do_swap_page() > -> Child can observe changes > > Exchanging 2. and 3. should have the same effect. > > Let's apply the same COW logic as in do_wp_page(), conditionally trying to > remove the page from the swapcache after freeing the swap entry, however, > before actually mapping our page. We can change the order now that > we use try_to_free_swap(), which doesn't care about the mapcount, > instead of reuse_swap_page(). > > To handle references from the LRU pagevecs, conditionally drain the local > LRU pagevecs when required, however, don't consider the page_count() when > deciding whether to drain to keep it simple for now. > > Signed-off-by: David Hildenbrand <david@xxxxxxxxxx> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>