Re: [PATCH 1/3] kvm tools: Make virtio-net kvm__irq_line thread safe

* Asias He <asias.hejun@xxxxxxxxx> wrote:

> On 04/29/2011 03:30 PM, Ingo Molnar wrote:
> > 
> > * Asias He <asias.hejun@xxxxxxxxx> wrote:
> > 
> >> This patch fixes a random virtio-net stall
> >>
> >> host $ scp guest:/root/big.guest .
> >> big.guest 42%  440MB  67.7KB/s - stalled -
> > 
> >> @@ -77,9 +77,12 @@ static void virtio_net_rx_callback(struct kvm *self, void *param)
> >>  		head = virt_queue__get_iov(vq, iov, &out, &in, self);
> >>  		len = readv(net_device.tap_fd, iov, in);
> >>  		virt_queue__set_used_elem(vq, head, len);
> >> -	}
> >>  
> >> -	kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
> >> +		/* We should interrupt guest right now, otherwise latency is huge. */
> >> +		mutex_lock(&net_device.mutex);
> >> +		kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
> >> +		mutex_unlock(&net_device.mutex);
> > 
> > The movement of the irq-line assertion call (which fixes the hangs) should be 
> > separated from the mutex_lock() additions (which look unnecessary).
> 
> I've sent out another patch to move this irq assertion.
> 
> > 
> > Could you please check whether the addition of the mutex really is necessary to 
> > solve the hang/stall?
> 
> 
> Without the mutex around kvm__irq_line(), I am still seeing the
> hangs (with the irq assertion movement patch applied).

Since the mutex has no effect on the other party that modifies the virtual 
queue (the guest OS, which updates the rings through shared memory, not 
through this code path), at best it changes timings and hides the bug.

The queues are private to the host worker thread they belong to - and their 
rings are shared with the guest OS. So we need to document the queueing 
protocol - who does what, and when - and then maybe it will become more 
obvious why the hangs are occurring.
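
As a starting point, here is my reading of the intended protocol (a sketch 
reconstructed from the code quoted below - treat the details as assumptions, 
not something verified against the tree):

	/*
	 * RX queueing protocol, as I read it:
	 *
	 *  guest:  puts free buffers into the avail ring,
	 *          writes VIRTIO_PCI_QUEUE_NOTIFY (the 'kick')
	 *
	 *  host:   the notify handler calls thread_pool__signal_work(),
	 *          which wakes the RX job; the job readv()s the TAP fd
	 *          into the buffers, fills the used ring and asserts
	 *          VIRTIO_NET_IRQ
	 *
	 *  guest:  takes the IRQ, consumes the used ring, refills the
	 *          avail ring and kicks again
	 */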

For example, what happens when the RX queue gets full? The RX thread:

        while (virt_queue__available(vq)) {

drops out of the TAP-read() loop and sleeps indefinitely. It will only be 
woken up when the guest signals that the queue is available again (i.e. there 
are free slots in it):

        case VIRTIO_PCI_QUEUE_NOTIFY: {
                uint16_t queue_index;
                queue_index     = ioport__read16(data);
                virtio_net_handle_callback(self, queue_index);
                break;
        }

...
static void virtio_net_handle_callback(struct kvm *self, uint16_t queue_index)
{
        thread_pool__signal_work(net_device.jobs[queue_index]);
}

with queue_index == the RX vring.
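
Putting the pieces together, the RX job presumably looks something like this 
(reconstructed from the hunks in this thread - the declarations and the iovec 
array size are my guesses, not the tree's):

	static void virtio_net_rx_callback(struct kvm *self, void *param)
	{
		struct virt_queue *vq = param;	/* assumed binding */
		struct iovec iov[16];		/* size assumed */
		uint16_t out, in, head;
		uint32_t len;

		while (virt_queue__available(vq)) {
			head = virt_queue__get_iov(vq, iov, &out, &in, self);
			len  = readv(net_device.tap_fd, iov, in);
			virt_queue__set_used_elem(vq, head, len);
			kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
		}

		/*
		 * Ring empty: we return here and the worker thread sleeps
		 * until the guest's next VIRTIO_PCI_QUEUE_NOTIFY makes
		 * thread_pool__signal_work() queue this job again.
		 */
	}

One failure mode that would match the symptoms: if the guest ever refills the 
ring without kicking (or a kick is lost), nothing signals the job again and 
the transfer stalls.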

Now, if this is correct, then virtio_net_rx_callback() is only ever called 
with a non-full ring. You could test this by adding an assert like the 
following to the head of that function:

	BUG_ON(!virt_queue__available(vq));

Does this BUG_ON() ever trigger?

Pick up BUG_ON() from tools/perf/:

  tools/perf/util/include/linux/kernel.h:#define BUG_ON(cond) assert(!(cond))
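
I.e. at the head of the RX callback, something like this (the define is 
perf's - kvm tools would need its own copy):

	#include <assert.h>
	#define BUG_ON(cond)	assert(!(cond))

	static void virtio_net_rx_callback(struct kvm *self, void *param)
	{
		struct virt_queue *vq = param;	/* assumed, as in the sketch above */

		/*
		 * If the protocol holds, we are only ever signalled with
		 * buffers available in the avail ring:
		 */
		BUG_ON(!virt_queue__available(vq));
		...
	}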

Thanks,

	Ingo

