On 17/01/2017 12:13, Dmitry Vyukov wrote:
> On Tue, Jan 17, 2017 at 12:08 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>
>> On 17/01/2017 10:56, Dmitry Vyukov wrote:
>>>> I am seeing use-after-frees in process_srcu as struct srcu_struct is
>>>> already freed. Before freeing struct srcu_struct, the code does
>>>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do:
>>>>
>>>> +       srcu_barrier(&kvm->irq_srcu);
>>>>         cleanup_srcu_struct(&kvm->irq_srcu);
>>>>
>>>> It reduced the rate of use-after-frees, but did not eliminate them
>>>> completely. The full thread is here:
>>>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
>>>>
>>>> Does Paolo's fix above make sense to you? Namely, adding
>>>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?
>>>
>>> I am not sure about the interaction of flush_delayed_work and
>>> srcu_reschedule... flush_delayed_work probably assumes that no work is
>>> queued concurrently, but what if srcu_reschedule queues another work
>>> item concurrently... can't it happen that flush_delayed_work will miss
>>> that newly scheduled work?
>>
>> Newly scheduled callbacks would be a bug in SRCU usage, but my patch is
>
> I mean not SRCU callbacks, but the sp->work being rescheduled.
> Consider that callbacks are already scheduled. We call
> flush_delayed_work, and it waits for completion of process_srcu. But that
> process_srcu schedules sp->work again in srcu_reschedule.
>
>> indeed insufficient. Because of SRCU's two-phase algorithm, it's possible
>> that the first flush_delayed_work doesn't invoke all callbacks. Instead I
>> would propose this (still untested, but this time with a commit message):
>>
>> ---------------- 8< --------------
>> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up"
>>
>> Even though there are no concurrent readers, it is possible that the
>> work item is queued for delayed processing when cleanup_srcu_struct is
>> called. The work item needs to be flushed before returning, or a
>> use-after-free can ensue.
>>
>> Furthermore, because of SRCU's two-phase algorithm it may take up to
>> two executions of srcu_advance_batches before all callbacks are invoked.
>> This can happen if the first flush_delayed_work happens as follows:
>>
>>     srcu_read_lock
>>                              process_srcu
>>                                srcu_advance_batches
>>                                  ...
>>                                  if (!try_check_zero(sp, idx^1, trycount))
>>                                      // there is a reader
>>                                      return;
>>                                srcu_invoke_callbacks
>>                                  ...
>>     srcu_read_unlock
>>     cleanup_srcu_struct
>>       flush_delayed_work
>>                              srcu_reschedule
>>                                queue_delayed_work
>>
>> Now flush_delayed_work returns, but srcu_reschedule will *not* have cleared
>> sp->running to false.
>>
>> Not-tested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>
>> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
>> index 9b9cdd549caa..9470f1ba2ef2 100644
>> --- a/kernel/rcu/srcu.c
>> +++ b/kernel/rcu/srcu.c
>> @@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
>>  {
>>         if (WARN_ON(srcu_readers_active(sp)))
>>                 return; /* Leakage unless caller handles error. */
>> +
>> +       /*
>> +        * No readers active, so any pending callbacks will rush through
>> +        * the two batches before sp->running becomes false.  No risk of
>> +        * busy-waiting.
>> +        */
>> +       while (sp->running)
>> +               flush_delayed_work(&sp->work);
>
> Unsynchronized accesses to shared state make me nervous. running is
> meant to be protected with sp->queue_lock.
I think it could just be

    while (flush_delayed_work(&sp->work));

but let's wait for Paul.

Paolo

> At least we will get back to you with a KTSAN report.
>
>>         free_percpu(sp->per_cpu_ref);
>>         sp->per_cpu_ref = NULL;
>>  }
>>
>> Thanks,
>>
>> Paolo
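
For illustration, a minimal and untested sketch of how the flush-until-idle
variant suggested above could look inside cleanup_srcu_struct(); this is not
the patch under discussion, and it assumes only the documented workqueue
behaviour that flush_delayed_work() returns true when it had to wait for a
pending work item and false once the work is idle:

    void cleanup_srcu_struct(struct srcu_struct *sp)
    {
            if (WARN_ON(srcu_readers_active(sp)))
                    return; /* Leakage unless caller handles error. */

            /*
             * Keep flushing as long as a work item was actually pending.
             * With no readers left, process_srcu() stops requeuing
             * sp->work via srcu_reschedule() once the two batches have
             * drained, so the loop terminates without touching
             * sp->running outside sp->queue_lock.
             */
            while (flush_delayed_work(&sp->work))
                    ;

            free_percpu(sp->per_cpu_ref);
            sp->per_cpu_ref = NULL;
    }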