On Mon, Oct 26, 2009 at 12:21:57PM -0400, Gregory Haskins wrote: > The current code suffers from the following race condition: > > thread-1 thread-2 > ----------------------------------------------------------- > > kvm_set_irq() { > rcu_read_lock() > irq_rt = rcu_dereference(table); > rcu_read_unlock(); > > kvm_set_irq_routing() { > mutex_lock(); > irq_rt = table; > rcu_assign_pointer(); > mutex_unlock(); > synchronize_rcu(); > > kfree(irq_rt); > > irq_rt->entry->set(); /* bad */ > > ------------------------------------------------------------- > > Because the pointer is accessed outside of the read-side critical > section. There are two basic patterns we can use to fix this bug: > > 1) Switch to sleeping-rcu and encompass the ->set() access within the > read-side critical section, > > OR > > 2) Add reference counting to the irq_rt structure, and simply acquire > the reference from within the RSCS. > > This patch implements solution (1). Looks like a good transformation! A few questions interspersed below. > Signed-off-by: Gregory Haskins <ghaskins@xxxxxxxxxx> > --- > > include/linux/kvm_host.h | 6 +++++- > virt/kvm/irq_comm.c | 50 +++++++++++++++++++++++++++------------------- > virt/kvm/kvm_main.c | 1 + > 3 files changed, 35 insertions(+), 22 deletions(-) > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index bd5a616..1fe135d 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -185,7 +185,10 @@ struct kvm { > > struct mutex irq_lock; > #ifdef CONFIG_HAVE_KVM_IRQCHIP > - struct kvm_irq_routing_table *irq_routing; > + struct { > + struct srcu_struct srcu; Each structure has its own SRCU domain. This is OK, but just asking if that is the intent. It does look like the SRCU primitives are passed a pointer to the correct structure, and that the return value from srcu_read_lock() gets passed into the matching srcu_read_unlock() like it needs to be, so that is good. > + struct kvm_irq_routing_table *table; > + } irq_routing; > struct hlist_head mask_notifier_list; > struct hlist_head irq_ack_notifier_list; > #endif [ . . . ] > @@ -155,21 +156,19 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level) > * IOAPIC. So set the bit in both. The guest will ignore > * writes to the unused one. > */ > - rcu_read_lock(); > - irq_rt = rcu_dereference(kvm->irq_routing); > + idx = srcu_read_lock(&kvm->irq_routing.srcu); > + irq_rt = rcu_dereference(kvm->irq_routing.table); > if (irq < irq_rt->nr_rt_entries) > - hlist_for_each_entry(e, n, &irq_rt->map[irq], link) > - irq_set[i++] = *e; > - rcu_read_unlock(); > + hlist_for_each_entry(e, n, &irq_rt->map[irq], link) { What prevents the above list from changing while we are traversing it? (Yes, presumably whatever was preventing it from changing before this patch, but what?) Mostly kvm->lock is held, but not always. And if kvm->lock were held all the time, there would be no point in using SRCU. ;-) > + int r; > > - while(i--) { > - int r; > - r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level); > - if (r < 0) > - continue; > + r = e->set(e, kvm, irq_source_id, level); > + if (r < 0) > + continue; > > - ret = r + ((ret < 0) ? 0 : ret); > - } > + ret = r + ((ret < 0) ? 0 : ret); > + } > + srcu_read_unlock(&kvm->irq_routing.srcu, idx); > > return ret; > } > @@ -179,17 +178,18 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin) > struct kvm_irq_ack_notifier *kian; > struct hlist_node *n; > int gsi; > + int idx; > > trace_kvm_ack_irq(irqchip, pin); > > - rcu_read_lock(); > - gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin]; > + idx = srcu_read_lock(&kvm->irq_routing.srcu); > + gsi = rcu_dereference(kvm->irq_routing.table)->chip[irqchip][pin]; > if (gsi != -1) > hlist_for_each_entry_rcu(kian, n, &kvm->irq_ack_notifier_list, > link) And same question here -- what keeps the above list from changing while we are traversing it? Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html