On Fri, 07 Feb 2025 13:21:44 +0000, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Fri, Feb 07, 2025 at 12:27:51PM +0000, Will Deacon wrote:
> > On Thu, Feb 06, 2025 at 02:10:55PM +0000, Mark Rutland wrote:
> > > There are several problems with the way hyp code lazily saves the host's
> > > FPSIMD/SVE state, including:
> > >
> > > * Host SVE being discarded unexpectedly due to inconsistent
> > >   configuration of TIF_SVE and CPACR_ELx.ZEN. This has been seen to
> > >   result in QEMU crashes where SVE is used by memmove(), as reported by
> > >   Eric Auger:
> > >
> > >   https://issues.redhat.com/browse/RHEL-68997
> > >
> > > * Host SVE state is discarded *after* modification by ptrace, which was an
> > >   unintentional ptrace ABI change introduced with lazy discarding of SVE state.
> > >
> > > * The host FPMR value can be discarded when running a non-protected VM,
> > >   where FPMR support is not exposed to a VM, and that VM uses
> > >   FPSIMD/SVE. In these cases the hyp code does not save the host's FPMR
> > >   before unbinding the host's FPSIMD/SVE/SME state, leaving a stale
> > >   value in memory.
> >
> > How hard would it be to write tests for these three scenarios? If we
> > had something to exercise the relevant paths then...
> >
> > > ... and so this eager save+flush probably needs to be backported to ALL
> > > stable trees.
> >
> > ... this backporting might be a little easier to be sure about?
>
> For the first case I have a quick and dirty test, which I've pushed to
> my arm64/kvm/fpsimd-tests branch in my kernel.org repo:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/
>   git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
>
> For the last case it should be possible to do something similar, but I
> hadn't had the time to dig in to the KVM selftests infrastructure and
> figure out how to configure the guest appropriately.
>
> For the ptrace case, the same symptoms can be provoked outside of KVM
> (and I'm currently working to fix that). From my PoV the important thing
> is that this fix happens to remove KVM from the set of cases the other
> fixes need to care about.
>
> FWIW I was assuming that I'd be handling the upstream backports, and I'd
> be testing with the test above and some additional assertions hacked
> into the kernel for testing.

I agree that having the tests around would be great, if only to catch
potential regressions. However, I really don't want to gate the fixes
on these tests.

So unless someone shouts, I intend to take this series in very
shortly. We can always merge the tests as a subsequent improvement.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.