On Thu, Feb 25, 2016 at 10:50:14PM -0800, Hugh Dickins wrote:
> It's a useful suggestion from Gerald, and your THP rework may have
> brought us closer to being able to rely on RCU locking rather than
> IRQ disablement there; but you were right just to delete the comment,
> there are other reasons why fast GUP still depends on IRQs disabled.
>
> For example, see the fallback tlb_remove_table_one() in mm/memory.c:
> that one uses smp_call_function() sending IPI to all CPUs concerned,
> without waiting an RCU grace period at all.

I fully agree, the refcounting change just drops the THP splitting from
the equation, but everything else remains. It's not like x86 is using
RCU for gup_fast when CONFIG_TRANSPARENT_HUGEPAGE=n.

The main issue Peter also pointed out is how it can be faster to wait
an RCU grace period than to send an IPI to only the CPUs that have an
active_mm matching the one the page belongs to, and I'm not exactly
sure that saving the cost of disabling irqs in gup_fast is going to pay
off.

It's not just swap: a large munmap should be able to free up
pagetables, or pagetables would get a footprint out of proportion with
the Rss of the process. In turn munmap will have to either block
synchronously for a long time before returning to userland, or return
to userland while the pagetable memory is still not free. Userland may
then mmap and munmap again in a loop, legitimately so, with unclear
side effects with regard to false positive OOM.

Then there's another issue with synchronize_sched():
__get_user_pages_fast has to be safe to run from irq (note the
local_irq_save instead of local_irq_disable) and KVM leverages it. KVM
just requires it to be atomic so it can run from inside a preempt
disabled section (i.e.
inside a spinlock). I'm fairly certain the irq-safe guarantee could be
dropped without pain and rcu_read_lock_sched() would be enough, but the
documentation of the IRQ-safe guarantees provided by
__get_user_pages_fast should also be altered if we were to use
synchronize_sched(), and that's a symbol exported to GPL modules too.

Overall my main concern in switching x86 to RCU gup-fast is the
performance of synchronize_sched() in large munmap pagetable teardown.

Thanks,
Andrea