[RFC] SVM: L2 hang with fresh L1 and old L0

Dear friends,

Recently, we (at OpenVZ) noticed an interesting issue: L2 VMs hang
on RHEL 7 based hosts with SVM (AMD).

Let me describe our test configuration:
- AMD EPYC 7443P (Milan) or AMD EPYC 7261 (Rome)
- RHEL 7 based kernel on the Host Node.
... and, most importantly, the kernel combinations:

L0 -----------> L1 --------> L2
RHEL 7       -> RHEL 7 --------> RHEL 7        *works*
RHEL 7       -> RHEL 7 --------> RHEL 8        *works*
RHEL 7       -> RHEL 7 --------> recent Fedora *works*
RHEL 7       -> RHEL 8 --------> RHEL 7        *L2 hang*
RHEL 7       -> fresh Fedora --> RHEL 7        *L2 hang*

or, more generally:
RHEL 7       -> RHEL 7 --------> *any tested Linux guest*  *works*
RHEL 7       -> RHEL 8 --------> *any tested Linux guest*  *L2 hang*

but at the same time:
RHEL 8       -> RHEL 8 --------> *any tested Linux guest*  *works*

This was the key observation, so I started bisecting the L1 kernel to find
a hint. The bisect pointed to commit:
c9d40913 ("KVM: x86: enable event window in inject_pending_event")

I immediately reverted it in the CentOS 8 kernel and reran the test,
and it... works! To conclude: if we have an *old* kernel on the host (L0) and a
*sufficiently new* kernel in L1, then L2 is totally broken (SVM only).

I've also tried porting this patch to the L0 kernel to check whether that fixes the issue.
And yes, it does. I wonder whether this will be useful information for KVM developers and users.
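
To make the pattern explicit, here is a minimal sketch that encodes the observations above as a rule: L2 hangs when L1 carries commit c9d40913 and L0 does not. The names HAS_PATCH, predicts_hang and mismatches are hypothetical, and the patch status per distribution is an assumption taken from this report (stock RHEL 7 without the patch, RHEL 8 and fresh Fedora with it). It only summarizes the test results and does not explain the root cause.

```python
# Assumption from this report: which tested kernels carry commit c9d40913.
HAS_PATCH = {
    "RHEL 7": False,        # stock RHEL 7 kernel, before the port
    "RHEL 8": True,
    "fresh Fedora": True,
}

def predicts_hang(l0, l1):
    """Rule suggested by the tests: new L1 on top of old L0 hangs L2."""
    return HAS_PATCH[l1] and not HAS_PATCH[l0]

# (L0, L1, did L2 hang?) rows taken from the tables above.
OBSERVED = [
    ("RHEL 7", "RHEL 7", False),
    ("RHEL 7", "RHEL 8", True),
    ("RHEL 7", "fresh Fedora", True),
    ("RHEL 8", "RHEL 8", False),
]

def mismatches():
    """Return the observed rows that the rule fails to predict."""
    return [row for row in OBSERVED if predicts_hang(row[0], row[1]) != row[2]]

if __name__ == "__main__":
    print(mismatches())
```

An empty list from mismatches() means the rule agrees with every tested combination. The L2 guest is not part of the rule because the tables show no dependence on it.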

My attempt to port it to the RHEL 7 kernel:
https://lists.openvz.org/pipermail/devel/2022-June/079776.html

Possibly I should port this patch to the stable kernels too and send it?

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v4.9.320&qt=grep&q=enable+event+window+in+inject_pending_event
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v4.14.285&qt=grep&q=enable+event+window+in+inject_pending_event
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v4.19.249&qt=grep&q=enable+event+window+in+inject_pending_event
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v5.4.201&qt=grep&q=enable+event+window+in+inject_pending_event

So the 4.9, 4.14, 4.19 and 5.4 kernels lack this patch.
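
The same check as the web searches above can be done in a local clone of the stable tree. This is a rough sketch, assuming the clone has the listed tags; build_log_cmd and has_patch are hypothetical helper names. It matches on the commit subject only, so a backport with a changed subject would be missed.

```python
import subprocess

SUBJECT = "KVM: x86: enable event window in inject_pending_event"

def build_log_cmd(rev, subject=SUBJECT):
    """Build a `git log` command that searches the history of `rev`
    for commits whose message contains `subject` literally."""
    return ["git", "log", "--oneline", "--fixed-strings",
            f"--grep={subject}", rev]

def has_patch(repo_dir, rev, subject=SUBJECT):
    """True if at least one commit reachable from `rev` matches `subject`."""
    result = subprocess.run(build_log_cmd(rev, subject), cwd=repo_dir,
                            check=True, capture_output=True, text=True)
    return bool(result.stdout.strip())

if __name__ == "__main__":
    # Assumption: the current directory is a clone of linux-stable.
    for tag in ("v4.9.320", "v4.14.285", "v4.19.249", "v5.4.201"):
        print(tag, has_patch(".", tag))
```

Walking the full history with --grep can take a while on a kernel tree.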

I haven't checked this yet, but it looks like, for instance, the setup

L0  -> L1   -> L2
5.4 -> 5.10 -> *any kernel version*

will hang on SVM.

Regards,
    Alex


