On Fri, 10 Apr 2020 at 01:00, Qian Cai <cai@xxxxxx> wrote:
>
> > On Apr 9, 2020, at 5:28 PM, Qian Cai <cai@xxxxxx> wrote:
> >
> >> On Apr 9, 2020, at 12:03 PM, Marco Elver <elver@xxxxxxxxxx> wrote:
> >>
> >> On Thu, 9 Apr 2020 at 17:30, Qian Cai <cai@xxxxxx> wrote:
> >>>
> >>>> On Apr 9, 2020, at 11:22 AM, Marco Elver <elver@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, 9 Apr 2020 at 17:10, Qian Cai <cai@xxxxxx> wrote:
> >>>>>
> >>>>>> On Apr 9, 2020, at 3:03 AM, Marco Elver <elver@xxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Wed, 8 Apr 2020 at 23:29, Qian Cai <cai@xxxxxx> wrote:
> >>>>>>>
> >>>>>>>> On Apr 8, 2020, at 5:25 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> On 08/04/20 22:59, Qian Cai wrote:
> >>>>>>>>> Running a simple thing on this AMD host would trigger a reset right away.
> >>>>>>>>> Unselect KCSAN kconfig makes everything work fine (the host would also
> >>>>>>>>> reset If only "echo off > /sys/kernel/debug/kcsan" before running qemu-kvm).
> >>>>>>>>
> >>>>>>>> Is this a regression or something you've just started to play with? (If
> >>>>>>>> anything, the assembly language conversion of the AMD world switch that
> >>>>>>>> is in linux-next could have reduced the likelihood of such a failure,
> >>>>>>>> not increased it).
> >>>>>>>
> >>>>>>> I don’t remember I had tried this combination before, so don’t know if it is a
> >>>>>>> regression or not.
> >>>>>>
> >>>>>> What happens with KASAN? My guess is that, since it also happens with
> >>>>>> "off", something that should not be instrumented is being
> >>>>>> instrumented.
> >>>>>
> >>>>> No, KASAN + KVM works fine.
> >>>>>
> >>>>>>
> >>>>>> What happens if you put a 'KCSAN_SANITIZE := n' into
> >>>>>> arch/x86/kvm/Makefile? Since it's hard for me to reproduce on this
> >>>>>
> >>>>> Yes, that works, but this below alone does not work,
> >>>>>
> >>>>> KCSAN_SANITIZE_kvm-amd.o := n
> >>>>
> >>>> There are some other files as well, that you could try until you hit
> >>>> the right one.
> >>>>
> >>>> But since this is in arch, 'KCSAN_SANITIZE := n' wouldn't be too bad
> >>>> for now. If you can't narrow it down further, do you want to send a
> >>>> patch?
> >>>
> >>> No, that would be pretty bad because it will disable KCSAN for Intel
> >>> KVM as well which is working perfectly fine right now. It is only AMD
> >>> is broken.
> >>
> >> Interesting. Unfortunately I don't have access to an AMD machine right now.
> >>
> >> Actually I think it should be:
> >>
> >> KCSAN_SANITIZE_svm.o := n
> >> KCSAN_SANITIZE_pmu_amd.o := n
> >>
> >> If you want to disable KCSAN for kvm-amd.
> >
> > KCSAN_SANITIZE_svm.o := n
> >
> > That alone works fine. I am wondering which functions there could trigger
> > perhaps some kind of recursing with KCSAN?
>
> Another data point is set CONFIG_KCSAN_INTERRUPT_WATCHER=n alone
> also fixed the issue. I saw quite a few interrupt related function in svm.c, so
> some interrupt-related recursion going on?

That would contradict what you said about it working if KCSAN is "off".
What kernel are you attempting to use in the VM?

Thanks,
-- Marco