On Fri, Apr 22, 2022 at 04:00:45PM +0000, Quentin Perret wrote:
> On Thursday 21 Apr 2022 at 16:40:56 (+0000), Oliver Upton wrote:
> > The other option would be to not touch the subtree at all until the rcu
> > callback, as at that point software will not tweak the tables any more.
> > No need for atomics/spinning and can just do a boring traversal.
>
> Right that is sort of what I had in mind. Note that I'm still trying to
> make my mind about the overall approach -- I can see how RCU protection
> provides a rather elegant solution to this problem, but this makes the
> whole thing inaccessible to e.g. pKVM where RCU is a non-starter.

Heh, figuring out how to do this for pKVM seemed hard hence my lazy
attempt :)

> A
> possible alternative that comes to mind would be to have all walkers
> take references on the pages as they walk down, and release them on
> their way back, but I'm still not sure how to make this race-safe. I'll
> have a think ...

Does pKVM ever collapse tables into blocks? That is the only reason any
of this mess ever gets roped in. If not I think it is possible to get
away with a rwlock with unmap on the write side and everything else on
the read side, right?

As far as regular KVM goes we get in this business when disabling dirty
logging on a memslot. Guest faults will lazily collapse the tables back
into blocks. An equally valid implementation would be just to unmap the
whole memslot and have the guest build out the tables again, which could
work with the aforementioned rwlock.

Do any of my ramblings sound workable? :)

--
Thanks,
Oliver