On Tue, May 24, 2022 at 02:42:04PM +0800, Guoqing Jiang wrote:
> From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>
> Backport of commit 2f15d027c05fac406decdb5eceb9ec0902b68f53 upstream.
>
> Async PF 'page ready' event may happen when LAPIC is (temporary) disabled.
> In particular, Sebastien reports that when Linux kernel is directly booted
> by Cloud Hypervisor, LAPIC is 'software disabled' when APF mechanism is
> initialized. On initialization KVM tries to inject 'wakeup all' event and
> puts the corresponding token to the slot. It is, however, failing to inject
> an interrupt (kvm_apic_set_irq() -> __apic_accept_irq() -> !apic_enabled())
> so the guest never gets notified and the whole APF mechanism gets stuck.
> The same issue is likely to happen if the guest temporary disables LAPIC
> and a previously unavailable page becomes available.
>
> Do two things to resolve the issue:
> - Avoid dequeuing 'page ready' events from APF queue when LAPIC is
>   disabled.
> - Trigger an attempt to deliver pending 'page ready' events when LAPIC
>   becomes enabled (SPIV or MSR_IA32_APICBASE).
>
> Reported-by: Sebastien Boeuf <sebastien.boeuf@xxxxxxxxx>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Message-Id: <20210422092948.568327-1-vkuznets@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> [Guoqing: backport to 5.10-stable ]
> Signed-off-by: Guoqing Jiang <guoqing.jiang@xxxxxxxxx>
> ---
> Hi,
>
> We encountered below task hang issue with 5.10.113 stable kernel.
>
> [ 246.845061] INFO: task rpmbuild:2303 blocked for more than 122 seconds.
> [ 246.846269] Not tainted 5.10.113-1.1.se2-default #1
> [ 246.847103] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 246.848248] task:rpmbuild state:D stack: 0 pid: 2303 ppid: 2302 flags:0x00000000
> [ 246.848252] Call Trace:
> [ 246.848266] __schedule+0x3f6/0x7c0
> [ 246.848289] ? __handle_mm_fault+0x3dd/0x6d0
> [ 246.848291] schedule+0x46/0xb0
> [ 246.848295] kvm_async_pf_task_wait_schedule+0x4b/0x90
> [ 246.848297] ? handle_mm_fault+0xbc/0x280
> [ 246.848300] __kvm_handle_async_pf+0x4f/0xb0
> [ 246.848303] exc_page_fault+0x204/0x540
> [ 246.848305] ? asm_exc_page_fault+0x8/0x30
> [ 246.848307] asm_exc_page_fault+0x1e/0x30
> [ 246.848310] RIP: 0033:0x7f122fbdfc90
>
> And after investigating, this patch resolve the issue. 5.12 stable kernel
> has already merged it by commit 36825931c607.

Now queued up, thanks.

greg k-h
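
For readers following the thread, the two-part fix described in the quoted commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions and is not the kernel code: `struct toy_vcpu` and every `toy_*` function are hypothetical names invented for illustration, the token value is arbitrary, and the real patch operates on KVM's own vCPU and LAPIC state.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_APF_QUEUE_MAX 8

/* Hypothetical toy model of one vCPU's async-PF state; not a KVM struct. */
struct toy_vcpu {
    bool lapic_enabled;                    /* models "LAPIC is enabled" */
    unsigned int slot;                     /* token slot seen by the guest; 0 = free */
    unsigned int done[TOY_APF_QUEUE_MAX];  /* FIFO of completed 'page ready' tokens */
    size_t head, count;
    unsigned int irqs_delivered;           /* notifications that reached the guest */
};

/* Queue a completed page; returns false if the toy queue is full. */
bool toy_queue_push(struct toy_vcpu *v, unsigned int token)
{
    if (v->count == TOY_APF_QUEUE_MAX)
        return false;
    v->done[(v->head + v->count) % TOY_APF_QUEUE_MAX] = token;
    v->count++;
    return true;
}

/*
 * Part 1 of the fix: do not dequeue a 'page ready' event while the LAPIC is
 * disabled, because the interrupt could not be injected and the token would
 * sit in the slot with nobody notified.
 */
bool toy_can_dequeue_page_ready(const struct toy_vcpu *v)
{
    return v->lapic_enabled && v->slot == 0;
}

/* Try to move one completed token into the slot and notify the guest. */
void toy_check_completion(struct toy_vcpu *v)
{
    if (v->count == 0 || !toy_can_dequeue_page_ready(v))
        return;
    v->slot = v->done[v->head];
    v->head = (v->head + 1) % TOY_APF_QUEUE_MAX;
    v->count--;
    v->irqs_delivered++;
}

/*
 * Part 2 of the fix: when the LAPIC goes from disabled to enabled, trigger
 * another delivery attempt so events queued in the meantime are not lost.
 */
void toy_set_lapic_enabled(struct toy_vcpu *v, bool enabled)
{
    bool was_enabled = v->lapic_enabled;

    v->lapic_enabled = enabled;
    if (enabled && !was_enabled)
        toy_check_completion(v);
}
```

In this model, a token pushed while `lapic_enabled` is false stays in the queue after `toy_check_completion()`, and is delivered only once `toy_set_lapic_enabled(v, true)` runs. Without the check in part 1, the token would be placed in the slot with no notification, which corresponds to the stuck state the commit message describes.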