On Tue, May 07, 2024 at 11:51:15PM -0300, Leonardo Bras wrote:
> On Tue, May 07, 2024 at 05:08:54PM -0700, Sean Christopherson wrote:
> > On Tue, May 07, 2024, Sean Christopherson wrote:
> > > On Tue, May 07, 2024, Paul E. McKenney wrote:

[ . . . ]

> > > > But if we do need RCU to be more aggressive about treating guest execution as
> > > > an RCU quiescent state within the host, that additional check would be an
> > > > excellent way of making that happen.
> > >
> > > It's not clear to me that being more agressive is warranted. If my understanding
> > > of the existing @user check is correct, we _could_ achieve similar functionality
> > > for vCPU tasks by defining a rule that KVM must never enter an RCU critical section
> > > with PF_VCPU set and IRQs enabled, and then rcu_pending() could check PF_VCPU.
> > > On x86, this would be relatively straightforward (hack-a-patch below), but I've
> > > no idea what it would look like on other architectures.
> > >
> > > But the value added isn't entirely clear to me, probably because I'm still missing
> > > something. KVM will have *very* recently called __ct_user_exit(CONTEXT_GUEST) to
> > > note the transition from guest to host kernel. Why isn't that a sufficient hook
> > > for RCU to infer grace period completion?
>
> This is one of the solutions I tested when I was trying to solve the bug:
> - Report quiescent state both in guest entry & guest exit.
>
> It improves the bug, but has 2 issues compared to the timing alternative:
> 1 - Saving jiffies to a per-cpu local variable is usually cheaper than
>     reporting a quiescent state
> 2 - If we report it on guest_exit() and some other cpu requests a grace
>     period in the next few cpu cycles, there is chance a timer interrupt
>     can trigger rcu_core() before the next guest_entry, which would
>     introduce unnecessary latency, and cause be the issue we are trying to
>     fix.
>
> I mean, it makes the bug reproduce less, but do not fix it.

OK, then it sounds like something might be needed, but again, I must
defer to you guys on the need.

If there is a need, what are your thoughts on the approach that Sean
suggested?

							Thanx, Paul
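
To make the "timing alternative" from the quoted text concrete, here is a
minimal user-space sketch, not kernel code and not the patch under
discussion. It assumes the idea is: store the current jiffies value in a
per-CPU slot on guest exit, then have the tick-time check skip invoking
rcu_core() if that CPU left the guest very recently. Every name below
(fake_jiffies, last_guest_exit, note_guest_exit, should_run_rcu_core) and
the one-tick threshold are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: the kernel would use real per-CPU data and jiffies. */
#define NR_CPUS 4
#define GUEST_EXIT_RECENT_TICKS 1UL	/* assumed threshold, not from the thread */

static unsigned long fake_jiffies;		/* simulated tick counter */
static unsigned long last_guest_exit[NR_CPUS];	/* tick of the last guest exit */
static bool has_exited_guest[NR_CPUS];		/* false until the first guest exit */

/* Cheap path taken on guest exit: just remember when it happened. */
static void note_guest_exit(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	last_guest_exit[cpu] = fake_jiffies;
	has_exited_guest[cpu] = true;
}

/* True if this CPU left the guest within the last few ticks. */
static bool recently_in_guest(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* Unsigned subtraction stays correct if the counter wraps around. */
	return has_exited_guest[cpu] &&
	       fake_jiffies - last_guest_exit[cpu] <= GUEST_EXIT_RECENT_TICKS;
}

/*
 * Tick-time decision, loosely modeled on the role of rcu_pending():
 * return true only if a grace period needs this CPU and the CPU has
 * not just come out of the guest.
 */
static bool should_run_rcu_core(int cpu, bool gp_needs_qs)
{
	if (!gp_needs_qs)
		return false;
	if (recently_in_guest(cpu))
		return false;	/* likely back in the guest soon; defer */
	return true;
}
```

In this sketch the guest-exit path is a single store, which is the cost
argument made in point 1 of the quoted text, and the decision to defer is
made at tick time rather than at guest exit, which is the window described
in point 2.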