On Tue, Jan 17, 2017 at 12:08 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
>
> On 17/01/2017 10:56, Dmitry Vyukov wrote:
>>> I am seeing use-after-frees in process_srcu as struct srcu_struct is
>>> already freed. Before freeing struct srcu_struct, the code does
>>> cleanup_srcu_struct(&kvm->irq_srcu). We also tried to do:
>>>
>>> +       srcu_barrier(&kvm->irq_srcu);
>>>         cleanup_srcu_struct(&kvm->irq_srcu);
>>>
>>> It reduced the rate of use-after-frees but did not eliminate them
>>> completely. The full thread is here:
>>> https://groups.google.com/forum/#!msg/syzkaller/i48YZ8mwePY/0PQ8GkQTBwAJ
>>>
>>> Does Paolo's fix above make sense to you? Namely, adding
>>> flush_delayed_work(&sp->work) to cleanup_srcu_struct()?
>>
>> I am not sure about the interaction of flush_delayed_work and
>> srcu_reschedule... flush_delayed_work probably assumes that no work is
>> queued concurrently, but what if srcu_reschedule queues another work
>> concurrently... can't it happen that flush_delayed_work misses that
>> newly scheduled work?
>
> Newly scheduled callbacks would be a bug in SRCU usage, but my patch is

I mean not SRCU callbacks, but the sp->work item itself being
rescheduled. Consider that callbacks are already scheduled. We call
flush_delayed_work, and it waits for the completion of process_srcu.
But that process_srcu schedules sp->work again in srcu_reschedule.
(A minimal userspace model of this race is at the end of this mail.)

> indeed insufficient. Because of SRCU's two-phase algorithm, it's possible
> that the first flush_delayed_work doesn't invoke all callbacks. Instead I
> would propose this (still untested, but this time with a commit message):
>
> ---------------- 8< --------------
> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Subject: [PATCH] srcu: wait for all callbacks before deeming SRCU "cleaned up"
>
> Even though there are no concurrent readers, it is possible that the
> work item is queued for delayed processing when cleanup_srcu_struct is
> called. The work item needs to be flushed before returning, or a
> use-after-free can ensue.
>
> Furthermore, because of SRCU's two-phase algorithm it may take up to
> two executions of srcu_advance_batches before all callbacks are invoked.
> This can happen if the first flush_delayed_work happens as follows:
>
>     srcu_read_lock
>                            process_srcu
>                                srcu_advance_batches
>                                    ...
>                                    if (!try_check_zero(sp, idx^1, trycount))
>                                        // there is a reader
>                                        return;
>                                srcu_invoke_callbacks
>                                    ...
>     srcu_read_unlock
>     cleanup_srcu_struct
>         flush_delayed_work
>                            srcu_reschedule
>                                queue_delayed_work
>
> Now flush_delayed_work returns, but srcu_reschedule will *not* have
> cleared sp->running to false.
>
> Not-tested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>
> diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
> index 9b9cdd549caa..9470f1ba2ef2 100644
> --- a/kernel/rcu/srcu.c
> +++ b/kernel/rcu/srcu.c
> @@ -283,6 +283,14 @@ void cleanup_srcu_struct(struct srcu_struct *sp)
>  {
>  	if (WARN_ON(srcu_readers_active(sp)))
>  		return; /* Leakage unless caller handles error. */
> +
> +	/*
> +	 * No readers active, so any pending callbacks will rush through the two
> +	 * batches before sp->running becomes false. No risk of busy-waiting.
> +	 */
> +	while (sp->running)
> +		flush_delayed_work(&sp->work);

Unsynchronized accesses to shared state make me nervous. sp->running is
meant to be protected by sp->queue_lock, so at the very least we will
get back to you with a KTSAN report.
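
Something like the following untested sketch would at least keep the
read of sp->running under the lock (it assumes cleanup_srcu_struct is
only ever called from process context, as today, so spin_lock_irq is
fine):

	/*
	 * Untested sketch: sample sp->running under sp->queue_lock so
	 * the read is ordered against srcu_reschedule(), which clears
	 * it under the same lock.  This still relies on the caller
	 * guaranteeing that no new call_srcu() can race with
	 * cleanup_srcu_struct().
	 */
	for (;;) {
		bool busy;

		spin_lock_irq(&sp->queue_lock);
		busy = sp->running;
		spin_unlock_irq(&sp->queue_lock);

		if (!busy)
			break;
		flush_delayed_work(&sp->work);
	}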

>  	free_percpu(sp->per_cpu_ref);
>  	sp->per_cpu_ref = NULL;
>  }
>
> Thanks,
>
> Paolo
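
To make the requeue race above concrete, here is a minimal userspace
model of it (pthreads and POSIX semaphores, NOT kernel code; all names
are made up). "flush" waits for the one execution of the handler it
found queued, exactly like flush_delayed_work, while the handler
re-queues itself the way srcu_reschedule does:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t work_queued;	/* a work instance is pending             */
static sem_t work_done;		/* one handler execution has finished     */
static sem_t resume;		/* gates the second execution (demo only) */

static void *workqueue_thread(void *arg)
{
	(void)arg;

	/* First execution: the one flush will wait for. */
	sem_wait(&work_queued);
	sem_post(&work_queued);		/* srcu_reschedule(): re-queue self */
	sem_post(&work_done);

	/* Second execution: runs after flush has already returned. */
	sem_wait(&resume);
	sem_wait(&work_queued);
	sem_post(&work_done);
	return NULL;
}

int main(void)
{
	pthread_t t;

	sem_init(&work_queued, 0, 1);	/* sp->work is already queued */
	sem_init(&work_done, 0, 0);
	sem_init(&resume, 0, 0);
	pthread_create(&t, NULL, workqueue_thread, NULL);

	/* flush_delayed_work(): wait for the execution we found queued. */
	sem_wait(&work_done);

	/* Flush returned, yet the handler has re-queued itself.  Freeing
	 * the srcu_struct at this point is the reported use-after-free. */
	if (sem_trywait(&work_queued) == 0) {
		printf("after flush: work is queued again\n");
		sem_post(&work_queued);
	}

	sem_post(&resume);		/* let the second execution run */
	pthread_join(t, NULL);
	return 0;
}

Build with "gcc -pthread"; it deterministically prints that the work is
pending again after the flush, which is the window the patch above
closes by looping until sp->running is false.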