On Tue, Jan 17, 2017 at 01:03:28PM +0100, Paolo Bonzini wrote:
> 
> 
> On 17/01/2017 12:13, Dmitry Vyukov wrote:
> > On Tue, Jan 17, 2017 at 12:08 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >>
> >>
> >> On 17/01/2017 10:56, Dmitry Vyukov wrote:
> >>>> I am seeing use-after-frees in process_srcu as struct srcu_struct
> >>>> is already freed. Before freeing struct srcu_struct, the code does
> >>>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do:
> >>>>
> >>>> +        srcu_barrier(&kvm->irq_srcu);
> >>>>          cleanup_srcu_struct(&kvm->irq_srcu);
> >>>>
> >>>> It reduced the rate of use-after-frees, but did not eliminate them
> >>>> completely. The full thread is here:
> >>>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
> >>>>
> >>>> Does Paolo's fix above make sense to you? Namely adding
> >>>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?

Yes, we do need a flush_delayed_work(), good catch!  But doing multiple
of them should not be necessary, because there shouldn't be any callbacks
at all once the srcu_barrier() returns, and the only time SRCU queues
more work is when there is at least one callback pending.

The code is making sure that no new call_srcu() invocations happen
before it does the srcu_barrier(), right?

So if you are seeing failures even with the single flush_delayed_work(),
it would be interesting to set a flag in the srcu_struct at
cleanup_srcu_struct() time, and then splat if srcu_reschedule() does its
queue_delayed_work() when that flag is set.
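Something along the lines of the following untested sketch is what I
have in mind.  The ->cleanup_started field is made up for illustration,
and the requeue path is paraphrased, so adjust to whatever your tree
actually has:

        /* Hypothetical flag, set once cleanup of this srcu_struct begins. */
        void cleanup_srcu_struct(struct srcu_struct *sp)
        {
                if (WARN_ON(srcu_readers_active(sp)))
                        return; /* Leakage unless caller handles error. */
                spin_lock_irq(&sp->queue_lock);
                sp->cleanup_started = true;     /* New bool field, illustration only. */
                spin_unlock_irq(&sp->queue_lock);
                flush_delayed_work(&sp->work);
                free_percpu(sp->per_cpu_ref);
                sp->per_cpu_ref = NULL;
        }

And in srcu_reschedule(), right where the work gets requeued:

        if (pending) {
                /* Splat if sp->work is requeued after cleanup started. */
                WARN_ON_ONCE(sp->cleanup_started);
                queue_delayed_work(system_power_efficient_wq,
                                   &sp->work, SRCU_INTERVAL);
        }

Any such splat would tell us that something is still generating SRCU
work after cleanup_srcu_struct() has been called.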
> >>> I am not sure about the interaction of flush_delayed_work and
> >>> srcu_reschedule... flush_delayed_work probably assumes that no work
> >>> is queued concurrently, but what if srcu_reschedule queues another
> >>> work item concurrently... can't it happen that flush_delayed_work
> >>> will miss that newly scheduled work?
> >>
> >> Newly scheduled callbacks would be a bug in SRCU usage, but my patch is
> >
> > I mean not SRCU callbacks, but the sp->work being rescheduled.
> > Consider that callbacks are already scheduled. We call
> > flush_delayed_work, it waits for completion of process_srcu. But that
> > process_srcu schedules sp->work again in srcu_reschedule.

It only does this if there are callbacks still on the srcu_struct, so
if you are seeing this, we either have a bug in SRCU that finds callbacks
when none are present, or we have a usage bug that is creating new
callbacks after srcu_barrier() starts.  Do any of your callback functions
invoke call_srcu()?  (Hey, I have to ask!)

> >> indeed insufficient. Because of SRCU's two-phase algorithm, it's
> >> possible that the first flush_delayed_work doesn't invoke all
> >> callbacks. Instead I would propose this (still untested, but this
> >> time with a commit message):
> >>
> >> ---------------- 8< --------------
> >> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up"
> >>
> >> Even though there are no concurrent readers, it is possible that the
> >> work item is queued for delayed processing when cleanup_srcu_struct
> >> is called. The work item needs to be flushed before returning, or a
> >> use-after-free can ensue.
> >>
> >> Furthermore, because of SRCU's two-phase algorithm it may take up to
> >> two executions of srcu_advance_batches before all callbacks are
> >> invoked. This can happen if the first flush_delayed_work happens as
> >> follows:
> >>
> >>     srcu_read_lock
> >>                              process_srcu
> >>                                srcu_advance_batches
> >>                                  ...
> >>                                  if (!try_check_zero(sp, idx^1, trycount))
> >>                                    // there is a reader
> >>                                    return;
> >>                                srcu_invoke_callbacks
> >>                                  ...
> >>     srcu_read_unlock
> >>     cleanup_srcu_struct
> >>       flush_delayed_work
> >>                                srcu_reschedule
> >>                                  queue_delayed_work
> >>
> >> Now flush_delayed_work returns, but srcu_reschedule will *not* have
> >> cleared sp->running to false.

But srcu_reschedule() sets sp->running to false if there are no
callbacks.  And at that point, there had better be no callbacks.

> >> Not-tested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >>
> >> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> >> index 9b9cdd549caa..9470f1ba2ef2 100644
> >> --- a/kernel/rcu/srcu.c
> >> +++ b/kernel/rcu/srcu.c
> >> @@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
> >>  {
> >>         if (WARN_ON(srcu_readers_active(sp)))
> >>                 return; /* Leakage unless caller handles error. */
> >> +
> >> +       /*
> >> +        * No readers active, so any pending callbacks will rush
> >> +        * through the two batches before sp->running becomes false.
> >> +        * No risk of busy-waiting.
> >> +        */
> >> +       while (sp->running)
> >> +               flush_delayed_work(&sp->work);
> >
> > Unsynchronized accesses to shared state make me nervous. running is
> > meant to be protected with sp->queue_lock.
> 
> I think it could just be
> 
>         while (flush_delayed_work(&sp->work));
> 
> but let's wait for Paul.

If it needs to be more than just a single flush_delayed_work(), we have
some other bug somewhere.  ;-)

							Thanx, Paul

> Paolo
> 
> > At least we will get back to you with a KTSAN report.
> >
> >>         free_percpu(sp->per_cpu_ref);
> >>         sp->per_cpu_ref = NULL;
> >>  }
> >>
> >>
> >> Thanks,
> >>
> >> Paolo
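P.S.  If it does turn out that a loop is wanted, the lock-protected form
of the ->running check that Dmitry is asking about might look like the
following untested sketch; though, again, if more than one
flush_delayed_work() is ever needed here, that is a bug in its own right:

        /*
         * Untested sketch: sample sp->running under sp->queue_lock
         * instead of relying on an unsynchronized read.
         */
        for (;;) {
                bool running;

                spin_lock_irq(&sp->queue_lock);
                running = sp->running;
                spin_unlock_irq(&sp->queue_lock);
                if (!running)
                        break;
                flush_delayed_work(&sp->work);
        }

Either way, the caller-side contract stands: first guarantee that no
further call_srcu() can execute on this srcu_struct, then srcu_barrier(),
and only then cleanup_srcu_struct().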