On Mon, 19 Feb 2024 16:20:40 +0800 Kairui Song <ryncsn@xxxxxxxxx> wrote:

> From: Kairui Song <kasong@xxxxxxxxxxx>
>
> When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
> swapin the same entry at the same time, they get different pages (A, B).
> Before one thread (T0) finishes the swapin and installs page (A)
> to the PTE, another thread (T1) could finish swapin of page (B),
> swap_free the entry, then swap out the possibly modified page
> reusing the same entry. It breaks the pte_same check in (T0) because
> PTE value is unchanged, causing ABA problem. Thread (T0) will
> install a stalled page (A) into the PTE and cause data corruption.
>
> @@ -3867,6 +3868,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	if (!folio) {
>  		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
>  		    __swap_count(entry) == 1) {
> +			/*
> +			 * Prevent parallel swapin from proceeding with
> +			 * the cache flag. Otherwise, another thread may
> +			 * finish swapin first, free the entry, and swapout
> +			 * reusing the same entry. It's undetectable as
> +			 * pte_same() returns true due to entry reuse.
> +			 */
> +			if (swapcache_prepare(entry)) {
> +				/* Relax a bit to prevent rapid repeated page faults */
> +				schedule_timeout_uninterruptible(1);

Well this is unpleasant.  How often can we expect this to occur?

> +				goto out;
> +			}
> +			need_clear_cache = true;
> +
>  			/* skip swapcache */
>  			folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
>  						vma, vmf->address, false);