Re: KCSAN + KVM = host reset

> On Apr 9, 2020, at 5:28 PM, Qian Cai <cai@xxxxxx> wrote:
> 
> 
> 
>> On Apr 9, 2020, at 12:03 PM, Marco Elver <elver@xxxxxxxxxx> wrote:
>> 
>> On Thu, 9 Apr 2020 at 17:30, Qian Cai <cai@xxxxxx> wrote:
>>> 
>>> 
>>> 
>>>> On Apr 9, 2020, at 11:22 AM, Marco Elver <elver@xxxxxxxxxx> wrote:
>>>> 
>>>> On Thu, 9 Apr 2020 at 17:10, Qian Cai <cai@xxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Apr 9, 2020, at 3:03 AM, Marco Elver <elver@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>> On Wed, 8 Apr 2020 at 23:29, Qian Cai <cai@xxxxxx> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Apr 8, 2020, at 5:25 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> On 08/04/20 22:59, Qian Cai wrote:
>>>>>>>>> Running a simple thing on this AMD host would trigger a reset right away.
>>>>>>>>> Unselecting the KCSAN kconfig makes everything work fine (the host still
>>>>>>>>> resets even after "echo off > /sys/kernel/debug/kcsan" before running qemu-kvm).
>>>>>>>> 
>>>>>>>> Is this a regression or something you've just started to play with?  (If
>>>>>>>> anything, the assembly language conversion of the AMD world switch that
>>>>>>>> is in linux-next could have reduced the likelihood of such a failure,
>>>>>>>> not increased it).
>>>>>>> 
>>>>>>> I don't remember trying this combination before, so I don't know if it is a
>>>>>>> regression or not.
>>>>>> 
>>>>>> What happens with KASAN? My guess is that, since it also happens with
>>>>>> "off", something that should not be instrumented is being
>>>>>> instrumented.
>>>>> 
>>>>> No, KASAN + KVM works fine.
>>>>> 
>>>>>> 
>>>>>> What happens if you put a 'KCSAN_SANITIZE := n' into
>>>>>> arch/x86/kvm/Makefile? Since it's hard for me to reproduce on this
>>>>> 
>>>>> Yes, that works, but the line below alone does not:
>>>>> 
>>>>> KCSAN_SANITIZE_kvm-amd.o := n
>>>> 
>>>> There are some other files as well, that you could try until you hit
>>>> the right one.
>>>> 
>>>> But since this is in arch, 'KCSAN_SANITIZE := n' wouldn't be too bad
>>>> for now. If you can't narrow it down further, do you want to send a
>>>> patch?
>>> 
>>> No, that would be pretty bad because it will disable KCSAN for Intel
>>> KVM as well, which is working perfectly fine right now. Only AMD is
>>> broken.
>> 
>> Interesting. Unfortunately I don't have access to an AMD machine right now.
>> 
>> Actually I think it should be:
>> 
>> KCSAN_SANITIZE_svm.o := n
>> KCSAN_SANITIZE_pmu_amd.o := n
>> 
>> If you want to disable KCSAN for kvm-amd.
> 
> KCSAN_SANITIZE_svm.o := n
> 
> That alone works fine. I wonder which functions there could be triggering
> some kind of recursion with KCSAN?

Another data point: setting CONFIG_KCSAN_INTERRUPT_WATCHER=n alone
also fixes the issue. I saw quite a few interrupt-related functions in svm.c, so
is some interrupt-related recursion going on?
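
For reference, a rough sketch of the narrowest Makefile workaround found so far.
The file it goes in (arch/x86/kvm/Makefile) is taken from the suggestion quoted
above; where exactly it sits within that file is an assumption, and this is not
a tested patch.

```makefile
# arch/x86/kvm/Makefile (position within the file is an assumption)
#
# Exclude only svm.c from KCSAN instrumentation. Unlike a directory-wide
# 'KCSAN_SANITIZE := n', this leaves the rest of KVM, including the
# Intel side, instrumented.
KCSAN_SANITIZE_svm.o := n
```

The other option is to leave the Makefile alone and build with
CONFIG_KCSAN_INTERRUPT_WATCHER=n, which keeps svm.c instrumented but changes
KCSAN behavior for the whole kernel.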


