Re: KVM Arm64 and Linux-RT issues

Julien,

On 23/07/2019 18:58, Julien Grall wrote:
> Hi all,
> 
> I have been playing with the latest branch of Linux RT (5.2-rt1) and noticed the 
> following splat when starting a KVM guest.
> 
> [  122.336254] 003: BUG: sleeping function called from invalid context at 
> kernel/locking/rtmutex.c:968
> [  122.336263] 003: in_atomic(): 1, irqs_disabled(): 0, pid: 1430, name: kvm-vcpu-1
> [  122.336267] 003: 2 locks held by kvm-vcpu-1/1430:
> [  122.336271] 003:  #0: ffff8007c1518100 (&vcpu->mutex){+.+.}, at: 
> kvm_vcpu_ioctl+0x70/0xae0
> [  122.336287] 003:  #1: ffff8007fb08b478 
> (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
> [  122.336299] 003: Preemption disabled at:
> [  122.336300] 003: [<ffff0000111a44e8>] schedule+0x30/0xd8
> [  122.336308] 003: CPU: 3 PID: 1430 Comm: kvm-vcpu-1 Tainted: G        W 
> 5.2.0-rt1-00008-g5bc0332820fd #88
> [  122.336311] 003: Hardware name: AMD Seattle (Rev.B0) Development Board 
> (Overdrive) (DT)
> [  122.336314] 003: Call trace:
> [  122.336315] 003:  dump_backtrace+0x0/0x130
> [  122.336319] 003:  show_stack+0x14/0x20
> [  122.336321] 003:  dump_stack+0xbc/0x104
> [  122.336324] 003:  ___might_sleep+0x198/0x238
> [  122.336327] 003:  rt_spin_lock+0x5c/0x70
> [  122.336330] 003:  hrtimer_grab_expiry_lock+0x24/0x40
> [  122.336332] 003:  hrtimer_cancel+0x1c/0x38
> [  122.336334] 003:  kvm_timer_vcpu_load+0x78/0x3e0
> [  122.336338] 003:  kvm_arch_vcpu_load+0x130/0x298
> [  122.336340] 003:  kvm_sched_in+0x38/0x68
> [  122.336342] 003:  finish_task_switch+0x14c/0x300
> [  122.336344] 003:  __schedule+0x2b8/0x8d0
> [  122.336346] 003:  schedule+0x38/0xd8
> [  122.336347] 003:  kvm_vcpu_block+0xac/0x790
> [  122.336349] 003:  kvm_handle_wfx+0x210/0x520
> [  122.336352] 003:  handle_exit+0x134/0x1d0
> [  122.336355] 003:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
> [  122.336357] 003:  kvm_vcpu_ioctl+0x3a0/0xae0
> [  122.336359] 003:  do_vfs_ioctl+0xbc/0x910
> [  122.336363] 003:  ksys_ioctl+0x78/0xa8
> [  122.336365] 003:  __arm64_sys_ioctl+0x1c/0x28
> [  122.336367] 003:  el0_svc_common.constprop.0+0x90/0x188
> [  122.336370] 003:  el0_svc_handler+0x28/0x78
> [  122.336373] 003:  el0_svc+0x8/0xc
> [  122.564216] 000: BUG: scheduling while atomic: kvm-vcpu-1/1430/0x00000002
> [  122.564221] 000: 2 locks held by kvm-vcpu-1/1430:
> [  122.564224] 000:  #0: ffff8007c1518100 (&vcpu->mutex){+.+.}, at: 
> kvm_vcpu_ioctl+0x70/0xae0
> [  122.564236] 000:  #1: ffff8007fb08b478 
> (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
> [  122.564245] 000: Modules linked in:
> [  122.564248] 000: Preemption disabled at:
> [  122.564249] 000: [<ffff0000111a44e8>] schedule+0x30/0xd8

[...]

> The first problem "BUG: sleeping function called from invalid context at 
> kernel/locking/rtmutex.c:968" seems to be related to RT-specific commit 
> d628c3c56cab "hrtimer: Introduce expiry spin lock".
> 
> From my understanding, the problem is that hrtimer_cancel() is called from a 
> preempt notifier, where preemption is disabled. The patch mentioned above 
> actually requires hrtimer_cancel() to be called from preemptible context.
> 
> Do you have any thoughts on how the problem should be addressed?

It really feels like a change in hrtimer_cancel semantics. From what I
understand, this is used to avoid racing against the softirq, but boy it
breaks things.

If this cannot be avoided, and we can no longer cancel the background
timer (which is used to emulate the vcpu timer while it is blocked
waiting for an interrupt) when the vcpu is scheduled in, then we must
move the cancellation to the point where the vcpu is unblocked instead,
which may have some side effects -- I'll have a look.
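
To make the idea of moving the cancellation concrete, here is a rough
userspace model in C. It is not kernel code: every name in it is
hypothetical, the "preempt depth" counter only stands in for running
inside a preempt notifier, and the "splat" counter stands in for the
"sleeping function called from invalid context" report quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name below is hypothetical. */
static int model_preempt_depth;  /* > 0 models "preemption disabled" */
static int model_splats;         /* counts would-be "sleeping function" BUGs */

struct model_vcpu {
	bool bg_timer_armed;     /* background timer emulating the vcpu timer */
};

/* Stand-in for a cancel that may sleep, as hrtimer_cancel() can on RT. */
static void model_cancel_may_sleep(struct model_vcpu *v)
{
	if (model_preempt_depth > 0)
		model_splats++;  /* sleeping call made from atomic context */
	v->bg_timer_armed = false;
}

/* Shape in the trace: cancel from the sched-in notifier (atomic). */
static void model_sched_in_cancels(struct model_vcpu *v)
{
	model_preempt_depth++;
	model_cancel_may_sleep(v);
	model_preempt_depth--;
}

/* Alternative shape: the sched-in notifier leaves the timer alone ... */
static void model_sched_in_no_cancel(struct model_vcpu *v)
{
	model_preempt_depth++;
	(void)v;                 /* timer stays armed across sched-in */
	model_preempt_depth--;
}

/* ... and the cancel runs once the vcpu is unblocked, preemptible. */
static void model_vcpu_unblocked(struct model_vcpu *v)
{
	assert(model_preempt_depth == 0);
	model_cancel_may_sleep(v);
}
```

In this model the timer stays armed between sched-in and the unblock
point, which is one place where side effects could come from.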

But that's not the only problem: we also have hrtimers used to emulate
timers while the vcpu is running, and these timers are canceled in
kvm_timer_vcpu_put(), which is also called from a preempt notifier.
Unfortunately, I don't have a reasonable solution for that (other than
putting this hrtimer_cancel in a workqueue and then chasing the
resulting races).
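
For illustration only, the deferral idea and one of the races it opens
can be modelled in userspace C. Nothing here is the kernel workqueue or
hrtimer API: the names are hypothetical, and the generation counter used
to detect a stale cancel request is an assumption of this sketch, not
something proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name below is hypothetical. */
struct model_timer {
	bool armed;
	bool cancel_queued;       /* a deferred cancel is pending */
	unsigned int generation;  /* bumped every time the timer is armed */
	unsigned int cancel_gen;  /* generation the queued cancel targets */
};

static int model_preempt_depth;   /* > 0 models "preemption disabled" */

static void model_timer_arm(struct model_timer *t)
{
	t->armed = true;
	t->generation++;
}

/* Put notifier: atomic, so it only records the request. */
static void model_vcpu_put(struct model_timer *t)
{
	model_preempt_depth++;
	t->cancel_queued = true;
	t->cancel_gen = t->generation;
	model_preempt_depth--;
}

/*
 * Deferred worker: runs preemptible, so a sleeping cancel would be fine.
 * Returns true only if it actually cancelled the timer.
 */
static bool model_cancel_worker(struct model_timer *t)
{
	assert(model_preempt_depth == 0);
	if (!t->cancel_queued)
		return false;
	t->cancel_queued = false;
	if (t->cancel_gen != t->generation)
		return false;     /* stale: timer was re-armed since the put */
	t->armed = false;
	return true;
}
```

Without the generation check, a worker that runs after the vcpu has been
loaded again would cancel a timer that is legitimately armed, which is
the kind of race alluded to above.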

Any other idea before I start tearing our timer code apart *again*?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


