On Sat, 2017-08-12 at 23:27 -0700, Nadav Amit wrote:
> Ben Hutchings <ben.hutchings@xxxxxxxxxxxxxxx> wrote:
>
> > On Wed, 2017-08-09 at 12:41 -0700, Greg Kroah-Hartman wrote:
> >> 4.4-stable review patch. If anyone has any objections, please let me know.
> >>
> >> ------------------
> >>
> >> From: Mel Gorman <mgorman@xxxxxxx>
> >>
> >> commit 3ea277194daaeaa84ce75180ec7c7a2075027a68 upstream.
> > [...]
> >> +/*
> >> + * Reclaim unmaps pages under the PTL but do not flush the TLB prior to
> >> + * releasing the PTL if TLB flushes are batched. It's possible for a parallel
> >> + * operation such as mprotect or munmap to race between reclaim unmapping
> >> + * the page and flushing the page. If this race occurs, it potentially allows
> >> + * access to data via a stale TLB entry. Tracking all mm's that have TLB
> >> + * batching in flight would be expensive during reclaim so instead track
> >> + * whether TLB batching occurred in the past and if so then do a flush here
> >> + * if required. This will cost one additional flush per reclaim cycle paid
> >> + * by the first operation at risk such as mprotect and mumap.
> >> + *
> >> + * This must be called under the PTL so that an access to tlb_flush_batched
> >> + * that is potentially a "reclaim vs mprotect/munmap/etc" race will synchronise
> >> + * via the PTL.
> >
> > What about USE_SPLIT_PTE_PTLOCKS? I don't see how you can use "the PTL"
> > to synchronise access to a per-mm flag.
>
> Although it is a per-mm flag, the only situations we care about it are those
> in which “the PTL” (i.e. the same PTL) is accessed by both the reclaimer
> (which batches the flushes) and mprotect/munmap/etc.

Is there anything that prevents this sequence?

P0                                P1                                P2
--                                --                                --
change_pte_range() [ptl=X]
-> flush_tlb_batch_pending()
   -> flush_tlb_mm()
                                  try_to_unmap_one() [ptl=Y]
                                  -> set_tlb_ubc_flush_pending()
                                     -> tlb_flush_batched = true
   -> tlb_flush_batched = false
                                                                    change_pte_range() [ptl=Y]
                                                                    -> flush_tlb_batch_pending() (nop)

Ben.
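
To make the interleaving above concrete, here is a minimal single-threaded
sketch in C that replays the three steps in order. It is a toy model under
stated assumptions, not kernel code: struct toy_mm, stale_tlb_entry and
replay_race() are hypothetical names invented for illustration, and the
page table locks X and Y appear only in comments, because the point is
that two different locks do not order the accesses to the per-mm flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these names are hypothetical and this is not kernel code. */
struct toy_mm {
    bool tlb_flush_batched; /* stands in for the per-mm flag */
    bool stale_tlb_entry;   /* true while an unmapped page may still be cached */
};

/*
 * Replays the interleaving from the diagram one step at a time.
 * Returns true if P2 skipped its flush while a stale entry still existed.
 */
static bool replay_race(struct toy_mm *mm)
{
    assert(mm != NULL);

    /* P0 under ptl X: sees the flag set and flushes the whole mm. */
    bool p0_saw_pending = mm->tlb_flush_batched;
    if (p0_saw_pending)
        mm->stale_tlb_entry = false;

    /* P1 under ptl Y: unmaps a page and defers the flush. */
    mm->stale_tlb_entry = true;
    mm->tlb_flush_batched = true;

    /* P0, still under ptl X: clears the flag, overwriting P1's store. */
    if (p0_saw_pending)
        mm->tlb_flush_batched = false;

    /* P2 under ptl Y: reads the flag and flushes only if it is set. */
    bool p2_flushed = mm->tlb_flush_batched;
    if (p2_flushed)
        mm->stale_tlb_entry = false;

    return !p2_flushed && mm->stale_tlb_entry;
}
```

In this model, starting with tlb_flush_batched already set (left over from
an earlier batched reclaim), replay_race() returns true: P2 sees the flag
clear and skips the flush even though P1's unmapped page has not been
flushed. Starting with the flag clear, P0 does nothing, P2 sees P1's store
and flushes, and the function returns false.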
-- 
Ben Hutchings
Software Developer, Codethink Ltd.