On Thu, Jan 29, 2015 at 12:06:44PM -0500, Steven Rostedt wrote:
> On Wed, 28 Jan 2015 10:55:53 -0800
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Then your only hope is to prevent the host (and other guests) from
> > preempting the real-time guest.
>
> Right!
>
> I think there's a miscommunication here.

I can easily believe that!

> Basically what is needed is to run the RT guest on a CPU by itself. We
> can all agree on that. That guest runs at a high priority where nothing
> should preempt it. We should enable NO_HZ_FULL, and move as much off of
> that CPU as possible (including rcu callbacks).
>
> I'm not sure if the code does this or not, but I believe it does. When
> we enter the guest, the host should be in an RCU quiescent state, where
> RCU will ignore the CPU that is running the guest. Remember, we are only
> talking about interactions of the host, not the workings of the guest.

NO_HZ_FULL will automatically tell RCU about the guest-execution
quiescent state because the guest is seen by the host as user-mode
execution. (Right? Or is KVM treating this specially such that RCU
doesn't see guest execution as a quiescent state? I think this is
currently handled correctly, because if it wasn't, you would get RCU
CPU stall warning messages.)

> Once this isolation happens, then the guest should be running in a
> state that it could handle RT reaction times for its own processes (if
> the guest OS supports it). The guest shouldn't be preempted by anything
> unless it does something that requires a service (interacting with the
> network or other baremetal device), then it will need to do the same
> things that any RT task must do.

Agreed!

> I think all this is feasible.

The one thing that gives me pause is the high contention on the root
(AKA only) rcu_node structure's ->lock field. If this persists, one
thing to try would be to build with CONFIG_RCU_FANOUT_LEAF=8 (or 4).
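
To make the effect of CONFIG_RCU_FANOUT_LEAF concrete, here is a rough model of how the leaf fanout shapes the rcu_node tree. It is a sketch under stated assumptions, not the kernel's code: the function name rcu_node_levels() is made up for illustration, the 16-CPU system is hypothetical (the thread does not give a CPU count), and the interior fanout of 64 assumes the usual 64-bit CONFIG_RCU_FANOUT default. The kernel sizes the tree from its own build-time and boot-time values, so treat the numbers as illustrative only.

```python
def rcu_node_levels(nr_cpus, fanout_leaf=16, fanout=64):
    """Return the number of rcu_node structures per level, root first.

    Illustrative model only: each leaf covers up to `fanout_leaf` CPUs and
    each interior node covers up to `fanout` children (assumed defaults).
    """
    if nr_cpus < 1 or fanout_leaf < 2 or fanout < 2:
        raise ValueError("nr_cpus must be >= 1 and fanouts must be >= 2")

    def div_round_up(a, b):
        # Integer ceiling division, avoiding floating point.
        return (a + b - 1) // b

    # Start at the leaf level and add parent levels until one root remains.
    counts = [div_round_up(nr_cpus, fanout_leaf)]
    while counts[0] > 1:
        counts.insert(0, div_round_up(counts[0], fanout))
    return counts


if __name__ == "__main__":
    for leaf in (16, 8, 4):
        print(f"16 CPUs, fanout_leaf={leaf}: {rcu_node_levels(16, leaf)}")
```

In this model, 16 CPUs with a leaf fanout of 16 yield a single rcu_node structure, so every CPU contends on the same ->lock. A leaf fanout of 8 or 4 yields two or four leaves under the root, which spreads CPUs across several leaf locks.
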
If that helps, it would be worthwhile to do some tracing or lock
profiling to see about reducing the ->lock contention for the default
CONFIG_RCU_FANOUT_LEAF=16.

My first thought when I saw the high contention was to introduce funnel
locking for grace-period start, but that is unlikely to help in cases
where there is only one rcu_node structure. ;-)

							Thanx, Paul