On Thu, May 20, 2021 at 10:38:29AM +0200, Halil Pasic wrote: > > It eliminates one atomic from a lock, but > > if you go on to immediately do something like try_module_get, which > > has an atomic inside, the whole thing is pointless (assuming > > try_module_get was even the right thing to do) > > I'm not sure about this argument, or that RCU should be used only for > highly performance sensitive read workloads. Fundamentally RCU is a technique for building a read/write lock that avoids an atomic incr on the read side. This comes at the cost of significantly slowing down the write side. Everything about RCU is very complicated, people typically use it wrong. People use it wrong so often than Paul wrote a very long list of guidelines to help understand if it is being used properly: Documentation/RCU/checklist.rst If you haven't memorized this document then you probably shouldn't be using RCU at all :) > Can you please elaborate on the argument above and also put your > statement in perspective with https://lwn.net/Articles/263130/? This article doesn't talk much about the downsides - and focuses on performance win. If you need the performance then sure use RCU, but RCU is not a general purpose replacement for a rwsem. People should *always* reach for a rwsem first. Design that properly and then ask themselves if the rwsem can or should be optimized to a RCU. > Yes a simple sleepable lock would work, and we wouldn't need > get_module(). Because before the vfio_ap module unloads, it > sets all vcpu->kvm->arch.crypto.pqap_hook instances to NULL. So if > we know that vcpu->kvm->arch.crypto.pqap_hook then we know > that the still have valid references to the module. Yes > But in my opinion RCU is also viable (not this patch). I think, I prefer > a lock for simplicity, unless it is not (deadlocks) ;). As soon as you said you needed to sleep RCU stopped being viable. To make a sleepable region you need to do an atomic write and at that instant all the value of RCU was destroyed - use a normal rwsem. There is SRCU that solves the sleepable problem, but it has an incredible cost in both write side performance and memory usage that should only be reached for if high read side performance is really required. Jason