On Tue, 27 Feb 2024 17:00:42 +0000 David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote: > On 27 February 2024 14:43:26 GMT, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > >On Tue, 27 Feb 2024 11:49:21 +0000 > >David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote: > > > >> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > >> index c16b6d394d55..d8b5326ecebc 100644 > >> --- a/arch/x86/kvm/xen.c > >> +++ b/arch/x86/kvm/xen.c > >> @@ -1736,9 +1736,23 @@ static int set_shinfo_evtchn_pending(struct kvm_vcpu *vcpu, u32 port) > >> unsigned long flags; > >> int rc = -EWOULDBLOCK; > >> > >> - read_lock_irqsave(&gpc->lock, flags); > >> + local_irq_save(flags); > >> + if (!read_trylock(&gpc->lock)) { > > > >Note, directly disabling interrupts in PREEMPT_RT is just as bad as doing > >so in non RT and the taking a mutex. > > Well, it *isn't* a mutex in non-RT. It's basically a spinlock. And even > though RT turns it into a mutex this is only a trylock with a fall back > to a slow path in process context, so it ends up being a very short > period of irqdisable/lock – just to validate the target address of the > cache, and write there. (Or fall back to that slow path, if the cache > needs revalidating). Yes, but if the trylock fails, you then call: + if (!read_trylock(&gpc->lock)) { + /* + * When PREEMPT_RT turns locks into mutexes, rwlocks are + * turned into mutexes and most interrupts are threaded. + * But timer events may be delivered in hardirq mode due + * to using HRTIMER_MODE_ABS_HARD. So bail to the slow + * path if the trylock fails in interrupt context. + */ + if (in_interrupt()) + goto out; + + read_lock(&gpc->lock); That takes the read_lock() with interrupts still disabled. On RT, that will likely schedule as it had just failed a trylock. + } + > > So I think this is probably OK, but... > > > >Worse yet, in PREEMPT_RT read_unlock_irqrestore() isn't going to > >re-enable interrupts. > > ... that's kind of odd. I think there's code elsewhere which assumes it > will, and pairs it with an explicit local_irq_save() like the above, in a > different atomic context (updating runstate time info when being > descheduled). So *that* needs changing for RT? I see Thomas replied. I'll leave the rest to him. -- Steve