On 1/4/25 08:27, Peter Zijlstra wrote:
>> Or should we make this unconditional on all native because we don't care about
>> the overhead and would like to have simpler code. I mean, disabling IRQs vs
>> batching and allocating memory...?
> The disabling IRQs on the GUP-fast side stays, it acts as a
> RCU-read-side section -- also mmu_gather reverts to sending IPIs if it
> runs out of memory (extremely rare).
>
> I don't think there is measurable overhead from doing the separate table
> batching, but I'm sure the robots will tell us.

We should _try_ to make it unconditional for simplicity if nothing else.

BTW, a few years back, some folks at Intel turned on
MMU_GATHER_RCU_TABLE_FREE and ran the usual 0day/LKP tests. I _think_ it
was when we were exploring the benefits of Intel's IPI-free TLB flushing
mechanism. We didn't find anything remarkable either way (IIRC).