On Tue, Nov 15, 2022 at 10:47:37AM -0800, Ricardo Koller wrote:
> On Wed, Nov 09, 2022 at 11:55:31PM +0000, Oliver Upton wrote:
> > On Wed, Nov 09, 2022 at 09:53:45PM +0000, Sean Christopherson wrote:
> > > On Mon, Nov 07, 2022, Oliver Upton wrote:
> > > > Use RCU to safely walk the stage-2 page tables in parallel. Acquire and
> > > > release the RCU read lock when traversing the page tables. Defer the
> > > > freeing of table memory to an RCU callback. Indirect the calls into RCU
> > > > and provide stubs for hypervisor code, as RCU is not available in such a
> > > > context.
> > > >
> > > > The RCU protection doesn't amount to much at the moment, as readers are
> > > > already protected by the read-write lock (all walkers that free table
> > > > memory take the write lock). Nonetheless, a subsequent change will
> > > > futher relax the locking requirements around the stage-2 MMU, thereby
> > > > depending on RCU.
> > >
> > > Two somewhat off-topic questions (because I'm curious):
> >
> > Worth asking!
> >
> > > 1. Are there plans to enable "fast" page faults on ARM? E.g. to fixup access
> > >    faults (handle_access_fault()) and/or write-protection faults without acquiring
> > >    mmu_lock?
> >
> > I don't have any plans personally.
> >
> > OTOH, adding support for read-side access faults is trivial, I just
> > didn't give it much thought as most large-scale implementations have
> > FEAT_HAFDBS (hardware access flag management).
>
> WDYT of permission relaxation (write-protection faults) on the fast
> path?
>
> The benefits won't be as good as in x86 due to the required TLBI, but
> may be worth it due to not dealing with the mmu lock and avoiding some
> of the context save/restore. Note that unlike x86, in ARM the TLB entry
> related to a protection fault needs to be flushed.

Right, the only guarantee we have on arm64 is that the TLB will never
hold an entry that would produce an access fault.
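
To make the permission-relaxation case a bit more concrete, here is a
rough userspace sketch of what a lock-free fixup could look like. It is
not KVM code: the bit layout, pte_t, fast_relax_write() and
tlb_invalidate_stub() are all hypothetical names made up for
illustration, and C11 atomics stand in for whatever the kernel would
actually use. The only points it tries to show are that the PTE update
has to be a compare-and-exchange so racing walkers cannot clobber each
other, and that a successful relaxation is followed by a TLB
invalidation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, NOT the real stage-2 descriptor format. */
#define PTE_VALID (UINT64_C(1) << 0)
#define PTE_WRITE (UINT64_C(1) << 1)

typedef _Atomic(uint64_t) pte_t;

/* Stand-in for the TLBI; it only counts how often it was requested. */
static unsigned int tlbi_count;

static void tlb_invalidate_stub(uint64_t ipa)
{
	(void)ipa;
	tlbi_count++;
}

/*
 * Try to make a write-protected PTE writable without taking any lock.
 *
 * Returns true if the PTE is writable on return (either we relaxed it or
 * a racing walker already had), false if the caller must fall back to a
 * slow path (invalid entry, or the PTE changed underneath us).
 */
static bool fast_relax_write(pte_t *ptep, uint64_t ipa)
{
	uint64_t old;

	assert(ptep);
	old = atomic_load_explicit(ptep, memory_order_acquire);

	if (!(old & PTE_VALID))
		return false;
	if (old & PTE_WRITE)
		return true;	/* someone else already fixed it up */

	/* Only publish the new value if the PTE is still what we read. */
	if (!atomic_compare_exchange_strong_explicit(ptep, &old,
						     old | PTE_WRITE,
						     memory_order_acq_rel,
						     memory_order_acquire))
		return false;

	/* The stale read-only entry may still be cached in the TLB. */
	tlb_invalidate_stub(ipa);
	return true;
}
```

A fault handler built along these lines would call fast_relax_write()
first and only fall back to the locked path when it returns false.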

I have no issues whatsoever with implementing a lock-free walker; we're
already most of the way there with the RCU implementation, modulo some
rules for atomic PTE updates. I don't believe lock acquisition is the
limiting factor for us quite yet, as break-before-make + lazy splitting
still hurts.

--
Thanks,
Oliver