On Tue, Aug 01, 2017 at 05:08:17PM -0700, Nadav Amit wrote:
>  void tlb_finish_mmu(struct mmu_gather *tlb,
>  		unsigned long start, unsigned long end)
>  {
> -	arch_tlb_finish_mmu(tlb, start, end);
> +	/*
> +	 * If there are parallel threads are doing PTE changes on same range
> +	 * under non-exclusive lock(e.g., mmap_sem read-side) but defer TLB
> +	 * flush by batching, a thread has stable TLB entry can fail to flush
> +	 * the TLB by observing pte_none|!pte_dirty, for example so flush TLB
> +	 * forcefully if we detect parallel PTE batching threads.
> +	 */
> +	bool force = mm_tlb_flush_nested(tlb->mm);
> +
> +	arch_tlb_finish_mmu(tlb, start, end, force);
>  }

I don't understand the comment nor the ordering. What guarantees we see
the increment if we need to?
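
To make the ordering question concrete, here is a minimal userspace sketch
of the pattern as I read it. It is not kernel code: flush_pending, pte,
batcher() and observe_after_pte_clear() are hypothetical stand-ins, and
C11 atomics with pthreads stand in for whatever primitives the kernel side
would actually use. One thread bumps a counter and then clears a "PTE";
the other observes the cleared PTE and then reads the counter. The reader
is only guaranteed to see the increment because the PTE store is a release
and the PTE load is an acquire.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for illustration only, not kernel symbols. */
static atomic_int flush_pending;	/* number of in-flight batching threads */
static atomic_int pte;			/* 1 = present, 0 = cleared */

/* Batching thread: announce itself first, then change the PTE. */
static void *batcher(void *arg)
{
	(void)arg;
	atomic_fetch_add_explicit(&flush_pending, 1, memory_order_relaxed);
	/*
	 * Release store: anyone who observes pte == 0 through an acquire
	 * load is guaranteed to also observe the increment above.
	 */
	atomic_store_explicit(&pte, 0, memory_order_release);
	return NULL;
}

/*
 * Returns the counter value seen by a thread that has already observed
 * the cleared PTE.
 */
static int observe_after_pte_clear(void)
{
	pthread_t t;
	int rc, seen;

	atomic_store(&flush_pending, 0);
	atomic_store(&pte, 1);

	rc = pthread_create(&t, NULL, batcher, NULL);
	assert(rc == 0);
	(void)rc;

	/* Acquire load: pairs with the release store in batcher(). */
	while (atomic_load_explicit(&pte, memory_order_acquire) != 0)
		;
	seen = atomic_load_explicit(&flush_pending, memory_order_relaxed);

	pthread_join(t, NULL);
	return seen;
}
```

With that pairing, observe_after_pte_clear() cannot return 0. If both PTE
accesses were relaxed, nothing in the language would forbid it, which is
the kind of guarantee I am asking about for the patch above.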