On Tue, Jan 17, 2017 at 10:47 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote: > On Mon, Jan 16, 2017 at 10:48 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: >> >> >> ----- Original Message ----- >>> From: "Dmitry Vyukov" <dvyukov@xxxxxxxxxx> >>> To: "Steve Rutherford" <srutherford@xxxxxxxxxx> >>> Cc: "syzkaller" <syzkaller@xxxxxxxxxxxxxxxx>, "Paolo Bonzini" <pbonzini@xxxxxxxxxx>, "Radim Krčmář" >>> <rkrcmar@xxxxxxxxxx>, "KVM list" <kvm@xxxxxxxxxxxxxxx>, "LKML" <linux-kernel@xxxxxxxxxxxxxxx> >>> Sent: Monday, January 16, 2017 10:34:26 PM >>> Subject: Re: kvm: use-after-free in process_srcu >>> >>> On Sun, Jan 15, 2017 at 6:11 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote: >>> > On Fri, Jan 13, 2017 at 10:19 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote: >>> >> On Fri, Jan 13, 2017 at 4:30 AM, Steve Rutherford >>> >> <srutherford@xxxxxxxxxx> wrote: >>> >>> I'm not that familiar with the kernel's workqueues, but this seems >>> >>> like the classic "callback outlives the memory it references" >>> >>> use-after-free, where the process_srcu callback is outliving struct >>> >>> kvm (which contains the srcu_struct). If that's right, then calling >>> >>> srcu_barrier (which should wait for all of the call_srcu callbacks to >>> >>> complete, which are what enqueue the process_srcu callbacks) before >>> >>> cleanup_srcu_struct in kvm_destroy_vm probably fixes this. >>> >>> >>> >>> The corresponding patch to virt/kvm/kvm_main.c looks something like: >>> >>> static void kvm_destroy_vm(struct kvm *kvm) >>> >>> { >>> >>> ... >>> >>> for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) >>> >>> kvm_free_memslots(kvm, kvm->memslots[i]); >>> >>> + srcu_barrier(&kvm->irq_srcu); >>> >>> cleanup_srcu_struct(&kvm->irq_srcu); >>> >>> + srcu_barrier(&kvm->srcu); >>> >>> cleanup_srcu_struct(&kvm->srcu); >>> >>> ... >>> >>> >>> >>> >>> >>> Since we don't have a repro, this obviously won't be readily testable. >>> >>> I find srcu subtle enough that I don't trust my reasoning fully (in >>> >>> particular, I don't trust that waiting for all of the call_srcu >>> >>> callbacks to complete also waits for all of the process_srcu >>> >>> callbacks). Someone else know if that's the case? >>> >> >>> >> >>> >> From the function description it looks like it should do the trick: >>> >> >>> >> 514 /** >>> >> 515 * srcu_barrier - Wait until all in-flight call_srcu() callbacks >>> >> complete. >>> >> 516 * @sp: srcu_struct on which to wait for in-flight callbacks. >>> >> 517 */ >>> >> 518 void srcu_barrier(struct srcu_struct *sp) >>> >> >>> >> I see this failure happening several times per day. I've applied your >>> >> patch locally and will check if I see these failures happening. >>> > >>> > >>> > I have not seen the crash in 3 days, when usually I see several >>> > crashes per night. So I think we can consider that the patch fixes the >>> > crash: >>> > >>> > Tested-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> >>> >>> >>> Unfortunately I hit it again with the patch applied. It definitely >>> happens less frequently now, but still happens: >> >> Try this: >> >> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c >> index 9b9cdd549caa..ef5599c65299 100644 >> --- a/kernel/rcu/srcu.c >> +++ b/kernel/rcu/srcu.c >> @@ -283,6 +283,7 @@ void cleanup_srcu_struct(struct srcu_struct *sp) >> { >> if (WARN_ON(srcu_readers_active(sp))) >> return; /* Leakage unless caller handles error. */ >> + flush_delayed_work(&sp->work); >> free_percpu(sp->per_cpu_ref); >> sp->per_cpu_ref = NULL; >> } >> >> I think it should subsume Steve's patch, but I'm not 101% sure. We >> will have to run this through Paul. > > > Hi +Pual, > > I am seeing use-after-frees in process_srcu as struct srcu_struct is > already freed. Before freeing struct srcu_struct, code does > cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do: > > + srcu_barrier(&kvm->irq_srcu); > cleanup_srcu_struct(&kvm->irq_srcu); > > It reduced rate of use-after-frees, but did not eliminate them > completely. The full threaded is here: > https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ > > Does Paolo's fix above make sense to you? Namely adding > flush_delayed_work(&sp->work) to cleanup_srcu_struct()? I am not sure about interaction of flush_delayed_work and srcu_reschedule... flush_delayed_work probably assumes that no work is queued concurrently, but what if srcu_reschedule queues another work concurrently... can't it happen that flush_delayed_work will miss that newly scheduled work? Meanwhile I will apply this change instead of Steve's change and see what happens. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html