On Wed, Apr 30, 2014 at 06:21:14PM +0100, Catalin Marinas wrote:
> On Wed, Apr 30, 2014 at 04:38:25PM +0100, Steve Capper wrote:
> > On Wed, Apr 30, 2014 at 04:33:17PM +0100, Catalin Marinas wrote:
> > > On Wed, Apr 30, 2014 at 04:20:47PM +0100, Catalin Marinas wrote:
> > > > On Fri, Mar 28, 2014 at 03:01:31PM +0000, Steve Capper wrote:
> > > > > In order to implement fast_get_user_pages we need to ensure that the
> > > > > page table walker is protected from page table pages being freed from
> > > > > under it.
> > > > >
> > > > > This patch enables HAVE_RCU_TABLE_FREE, any page table pages belonging
> > > > > to address spaces with multiple users will be call_rcu_sched freed.
> > > > > Meaning that disabling interrupts will block the free and protect the
> > > > > fast gup page walker.
> > > > >
> > > > > Signed-off-by: Steve Capper <steve.capper@xxxxxxxxxx>
> > > >
> > > > While this patch is simple, I'd like to better understand the reason for
> > > > it. Currently HAVE_RCU_TABLE_FREE is enabled for powerpc and sparc while
> > > > __get_user_pages_fast() is supported by a few other architectures that
> > > > don't select HAVE_RCU_TABLE_FREE. So why do we need it for fast gup on
> > > > arm/arm64 while not all the other archs need it?
> > >
> > > OK, replying to myself. I assume the other architectures that don't need
> > > HAVE_RCU_TABLE_FREE use IPI for TLB shootdown, hence they gup_fast
> > > synchronisation for free.
> >
> > Yes that is roughly the case.
> > Essentially we want to RCU free the page table backing pages at a
> > later time when we aren't walking on them.
> >
> > Other arches use IPI, some others have their own RCU logic. I opted to
> > activate some existing logic to reduce code duplication.
>
> Both powerpc and sparc use tlb_remove_table() via their __pte_free_tlb()
> etc. which implies an IPI for synchronisation if mm_users > 1. For
> gup_fast we may not need it since we use the RCU for protection. Am I
> missing anything?

So my understanding is:

tlb_remove_table will immediately free any pages where there's only a
single user, as there's no need to consider a concurrent gup walker.

For the case of multiple users we have an mmu_table_batch structure
that holds references to pages that should be freed at a later point.

This batch is contained in a page that is allocated on the fly. If, for
any reason, we can't allocate the batch container, we fall back to a
slow path, which is to issue an IPI (via tlb_remove_table_one). This
IPI will block on the gup walker, which runs with interrupts disabled.
We need this fallback behaviour on ARM/ARM64.

Most of the time we will be able to allocate the batch container, and
we will populate it with references to the pages backing the page
tables. These are freed via an RCU-sched delayed callback to
tlb_remove_table_rcu.

In the fast_gup walker, we block tlb_remove_table_rcu from running by
disabling interrupts in the critical path. Technically we could issue a
call to rcu_read_lock_sched instead to block tlb_remove_table_rcu, but
that wouldn't be sufficient to block THP splits; so we opt to disable
interrupts to block both THP splits and tlb_remove_table_rcu.

Cheers,
-- 
Steve

>
> --
> Catalin
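
For anyone following along, the three freeing paths described above can
be sketched as a small userspace model. This is only an illustration of
the decision flow, not the kernel code: every identifier below
(model_gather, model_remove_table, MODEL_MAX_BATCH, the enum values) is
made up for the example, the batch "page" is a plain struct member, and
a failed allocation is simulated with a flag. It also does not model
the flush of a partially filled batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model of the freeing decision only. Every name here is
 * hypothetical; this is not the kernel's mmu_gather code.
 */
#define MODEL_MAX_BATCH 4	/* assumed capacity, chosen for illustration */

enum free_path {
	FREED_IMMEDIATELY,	/* single user: no concurrent walker possible */
	FREED_AFTER_IPI,	/* slow path: batch container unavailable */
	QUEUED_FOR_RCU,		/* common path: freed later by an RCU callback */
};

struct model_batch {
	size_t nr;
	int tables[MODEL_MAX_BATCH];
};

struct model_gather {
	int mm_users;			/* users of the address space */
	bool alloc_fails;		/* simulate a failed batch allocation */
	struct model_batch *batch;	/* NULL until first needed */
	struct model_batch storage;	/* stands in for the on-the-fly page */
	int rcu_batches;		/* full batches handed to RCU so far */
};

static enum free_path model_remove_table(struct model_gather *tlb, int table)
{
	assert(tlb != NULL);

	/* With fewer than two users nobody else can be walking the tables. */
	if (tlb->mm_users < 2)
		return FREED_IMMEDIATELY;

	if (tlb->batch == NULL) {
		/* No container available: fall back to the IPI slow path. */
		if (tlb->alloc_fails)
			return FREED_AFTER_IPI;
		tlb->batch = &tlb->storage;
		tlb->batch->nr = 0;
	}

	/* Record the table; a full batch is handed over for deferred freeing. */
	tlb->batch->tables[tlb->batch->nr++] = table;
	if (tlb->batch->nr == MODEL_MAX_BATCH) {
		tlb->rcu_batches++;
		tlb->batch = NULL;
	}
	return QUEUED_FOR_RCU;
}
```

In this model only the QUEUED_FOR_RCU path relies on the walker having
interrupts disabled to hold off the deferred free; the FREED_AFTER_IPI
path is the fallback that would have to wait for the walker instead.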