Ben Hutchings <ben.hutchings@xxxxxxxxxxxxxxx> wrote:

> On Sat, 2017-08-12 at 23:27 -0700, Nadav Amit wrote:
>> Ben Hutchings <ben.hutchings@xxxxxxxxxxxxxxx> wrote:
>>
>>> On Wed, 2017-08-09 at 12:41 -0700, Greg Kroah-Hartman wrote:
>>>> 4.4-stable review patch. If anyone has any objections, please let me know.
>>>>
>>>> ------------------
>>>>
>>>> From: Mel Gorman <mgorman@xxxxxxx>
>>>>
>>>> commit 3ea277194daaeaa84ce75180ec7c7a2075027a68 upstream.
>>> [...]
>>>> +/*
>>>> + * Reclaim unmaps pages under the PTL but do not flush the TLB prior to
>>>> + * releasing the PTL if TLB flushes are batched. It's possible for a parallel
>>>> + * operation such as mprotect or munmap to race between reclaim unmapping
>>>> + * the page and flushing the page. If this race occurs, it potentially allows
>>>> + * access to data via a stale TLB entry. Tracking all mm's that have TLB
>>>> + * batching in flight would be expensive during reclaim so instead track
>>>> + * whether TLB batching occurred in the past and if so then do a flush here
>>>> + * if required. This will cost one additional flush per reclaim cycle paid
>>>> + * by the first operation at risk such as mprotect and mumap.
>>>> + *
>>>> + * This must be called under the PTL so that an access to tlb_flush_batched
>>>> + * that is potentially a "reclaim vs mprotect/munmap/etc" race will synchronise
>>>> + * via the PTL.
>>>
>>> What about USE_SPLIT_PTE_PTLOCKS? I don't see how you can use "the PTL"
>>> to synchronise access to a per-mm flag.
>>
>> Although it is a per-mm flag, the only situations we care about it are those
>> in which “the PTL” (i.e. the same PTL) is accessed by both the reclaimer
>> (which batches the flushes) and mprotect/munmap/etc.
>
> Is there anything that presents this sequence?
>
> P0                              P1                              P2
> --                              --                              --
>
>                                 change_pte_range() [ptl=X]
>                                 -> flush_tlb_batch_pending()
>                                    -> flush_tlb_mm()
> try_to_unmap_one() [ptl=Y]
> -> set_tlb_ubc_flush_pending()
>    -> tlb_flush_batched = true
>                                    -> tlb_flush_batched = false
>
>                                                                 change_pte_range() [ptl=Y]
>                                                                 -> flush_tlb_batch_pending()
>                                                                    (nop)

I think (but am not sure) that you are referring to a concern similar to one
I raised before. Mel gave an answer [1], but I cannot say I feel very
comfortable with it.

[1] http://www.spinics.net/lists/linux-mm/msg131265.html

Nadav
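
The interleaving quoted above can be replayed as a single-threaded toy model,
which may make it easier to see what the question is about. This is only a
sketch, not kernel code: every toy_* name is hypothetical, locking and CPUs
are not modelled at all, and it assumes the flag was already set by an
earlier batch when P1 enters (otherwise P1 would not reach flush_tlb_mm()).
It shows what the flag would look like if that ordering were reachable; it
does not settle whether the kernel actually allows it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only (hypothetical names): one mm, one per-mm batching flag. */
struct toy_mm {
	bool tlb_flush_batched;	/* per-mm flag, shared by every PTL */
	bool stale_tlb_entry;	/* an unmapped page may still be in a TLB */
};

/* Reclaim side (P0): unmap a page and defer its TLB flush. */
static void toy_try_to_unmap_one(struct toy_mm *mm)
{
	mm->stale_tlb_entry = true;
	mm->tlb_flush_batched = true;
}

/* First half of the pending-flush check: test the flag, then flush. */
static bool toy_flush_if_batched(struct toy_mm *mm)
{
	if (!mm->tlb_flush_batched)
		return false;		/* the "(nop)" case */
	mm->stale_tlb_entry = false;	/* stands in for flush_tlb_mm() */
	return true;
}

/* Second half: clear the flag once the flush is done. */
static void toy_clear_batched(struct toy_mm *mm)
{
	mm->tlb_flush_batched = false;
}

/* Replay the quoted ordering; returns true if a stale entry survives. */
static bool toy_replay_sequence(void)
{
	/* Assumption: an earlier reclaim batch already set the flag. */
	struct toy_mm mm = { .tlb_flush_batched = true, .stale_tlb_entry = true };
	bool flushed;

	/* P1, ptl=X: sees the flag and flushes. */
	flushed = toy_flush_if_batched(&mm);
	assert(flushed);

	/* P0, ptl=Y: unmaps another page and sets the flag again. */
	toy_try_to_unmap_one(&mm);

	/* P1, still ptl=X: clears the flag that P0 has just set. */
	toy_clear_batched(&mm);

	/* P2, ptl=Y: finds the flag clear, so no flush happens. */
	flushed = toy_flush_if_batched(&mm);
	assert(!flushed);

	return mm.stale_tlb_entry;
}
```

In this model toy_replay_sequence() returns true: P0's deferred flush is
lost because P1 cleared the flag while holding a different PTL.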