On Mon, Jan 16, 2017 at 10:48 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: > > > ----- Original Message ----- >> From: "Dmitry Vyukov" <dvyukov@xxxxxxxxxx> >> To: "Steve Rutherford" <srutherford@xxxxxxxxxx> >> Cc: "syzkaller" <syzkaller@xxxxxxxxxxxxxxxx>, "Paolo Bonzini" <pbonzini@xxxxxxxxxx>, "Radim Krčmář" >> <rkrcmar@xxxxxxxxxx>, "KVM list" <kvm@xxxxxxxxxxxxxxx>, "LKML" <linux-kernel@xxxxxxxxxxxxxxx> >> Sent: Monday, January 16, 2017 10:34:26 PM >> Subject: Re: kvm: use-after-free in process_srcu >> >> On Sun, Jan 15, 2017 at 6:11 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote: >> > On Fri, Jan 13, 2017 at 10:19 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote: >> >> On Fri, Jan 13, 2017 at 4:30 AM, Steve Rutherford >> >> <srutherford@xxxxxxxxxx> wrote: >> >>> I'm not that familiar with the kernel's workqueues, but this seems >> >>> like the classic "callback outlives the memory it references" >> >>> use-after-free, where the process_srcu callback is outliving struct >> >>> kvm (which contains the srcu_struct). If that's right, then calling >> >>> srcu_barrier (which should wait for all of the call_srcu callbacks to >> >>> complete, which are what enqueue the process_srcu callbacks) before >> >>> cleanup_srcu_struct in kvm_destroy_vm probably fixes this. >> >>> >> >>> The corresponding patch to virt/kvm/kvm_main.c looks something like: >> >>> static void kvm_destroy_vm(struct kvm *kvm) >> >>> { >> >>> ... >> >>> for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) >> >>> kvm_free_memslots(kvm, kvm->memslots[i]); >> >>> + srcu_barrier(&kvm->irq_srcu); >> >>> cleanup_srcu_struct(&kvm->irq_srcu); >> >>> + srcu_barrier(&kvm->srcu); >> >>> cleanup_srcu_struct(&kvm->srcu); >> >>> ... >> >>> >> >>> >> >>> Since we don't have a repro, this obviously won't be readily testable. >> >>> I find srcu subtle enough that I don't trust my reasoning fully (in >> >>> particular, I don't trust that waiting for all of the call_srcu >> >>> callbacks to complete also waits for all of the process_srcu >> >>> callbacks). Someone else know if that's the case? >> >> >> >> >> >> From the function description it looks like it should do the trick: >> >> >> >> 514 /** >> >> 515 * srcu_barrier - Wait until all in-flight call_srcu() callbacks >> >> complete. >> >> 516 * @sp: srcu_struct on which to wait for in-flight callbacks. >> >> 517 */ >> >> 518 void srcu_barrier(struct srcu_struct *sp) >> >> >> >> I see this failure happening several times per day. I've applied your >> >> patch locally and will check if I see these failures happening. >> > >> > >> > I have not seen the crash in 3 days, when usually I see several >> > crashes per night. So I think we can consider that the patch fixes the >> > crash: >> > >> > Tested-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> >> >> >> Unfortunately I hit it again with the patch applied. It definitely >> happens less frequently now, but still happens: > > Try this: > > diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c > index 9b9cdd549caa..ef5599c65299 100644 > --- a/kernel/rcu/srcu.c > +++ b/kernel/rcu/srcu.c > @@ -283,6 +283,7 @@ void cleanup_srcu_struct(struct srcu_struct *sp) > { > if (WARN_ON(srcu_readers_active(sp))) > return; /* Leakage unless caller handles error. */ > + flush_delayed_work(&sp->work); > free_percpu(sp->per_cpu_ref); > sp->per_cpu_ref = NULL; > } > > I think it should subsume Steve's patch, but I'm not 101% sure. We > will have to run this through Paul. Hi +Pual, I am seeing use-after-frees in process_srcu as struct srcu_struct is already freed. Before freeing struct srcu_struct, code does cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do: + srcu_barrier(&kvm->irq_srcu); cleanup_srcu_struct(&kvm->irq_srcu); It reduced rate of use-after-frees, but did not eliminate them completely. The full threaded is here: https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ Does Paolo's fix above make sense to you? Namely adding flush_delayed_work(&sp->work) to cleanup_srcu_struct()? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html