Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> On Mon, Jul 17, 2017 at 6:40 PM, Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>> Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>
>>> On Mon, Jul 17, 2017 at 11:02 AM, Nadav Amit <namit@xxxxxxxxxx> wrote:
>>>> Setting and clearing mm->tlb_flush_pending can be performed by
>>>> multiple threads, since mmap_sem may only be acquired for read in
>>>> task_numa_work(). If this happens, tlb_flush_pending may be cleared
>>>> while one of the threads is still changing PTEs and batching TLB
>>>> flushes.
>>>>
>>>> As a result, TLB flushes can be skipped because the indication of
>>>> pending TLB flushes is lost, for instance due to a race between
>>>> migration and change_protection_range() (just as in the scenario
>>>> that caused the introduction of tlb_flush_pending).
>>>>
>>>> The feasibility of such a scenario was confirmed by adding an
>>>> assertion that tlb_flush_pending is not set by two threads, adding
>>>> artificial latency in change_protection_range(), and using sysctl
>>>> to reduce kernel.numa_balancing_scan_delay_ms.
>>>
>>> This thing is logically a refcount. Should it be refcount_t?
>>
>> I don’t think so. refcount_inc() would WARN_ONCE if the counter is
>> zero before the increase, although that is a valid scenario here.
>
> Hmm. Maybe a refcount that starts at 1? My point is that, if someone
> could force it to overflow, it would be bad. Maybe this isn't worth
> worrying about.

I don’t think it is an issue. At most one task_numa_work() per core can
be running at any given moment.
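
For concreteness, here is a minimal userspace sketch (C11 atomics) of
the counter-based variant being discussed: tlb_flush_pending becomes a
count of threads currently batching PTE changes, rather than a single
set/clear flag, so one thread finishing cannot hide another thread's
still-pending flush. The helper names here mirror the thread's
discussion but are illustrative assumptions, not the merged kernel
patch:

	#include <stdatomic.h>
	#include <stdbool.h>

	struct mm_struct {
		atomic_int tlb_flush_pending;	/* count of in-flight batchers */
	};

	/* Called before a thread starts changing PTEs and batching flushes. */
	static void inc_tlb_flush_pending(struct mm_struct *mm)
	{
		atomic_fetch_add(&mm->tlb_flush_pending, 1);
	}

	/* Called once that thread's batched TLB flush has been performed. */
	static void dec_tlb_flush_pending(struct mm_struct *mm)
	{
		atomic_fetch_sub(&mm->tlb_flush_pending, 1);
	}

	/*
	 * Readers (e.g. the migration path) only care whether any flush
	 * is still pending; the count reads as zero only after every
	 * batcher has decremented.
	 */
	static bool mm_tlb_flush_pending(struct mm_struct *mm)
	{
		return atomic_load(&mm->tlb_flush_pending) > 0;
	}

With a plain flag, two concurrent callers can interleave set/clear so
the flag reads "clear" while one batch is still unflushed; with the
count, the pending indication survives until the last batcher is done.
It also shows why refcount_t is an awkward fit: the counter legitimately
sits at zero between batches, so the 0 -> 1 transition must not warn.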