Re: [PATCH 1/3] kvm tools: Make virtio-net kvm__irq_line thread safe

* Sasha Levin <levinsasha928@xxxxxxxxx> wrote:

> On Fri, 2011-04-29 at 15:13 +0800, Asias He wrote:
> > On 04/29/2011 02:52 PM, Pekka Enberg wrote:
> > > Please make that IRQ latency fix a separate patch. Don't we need to do
> > > it for TX path as well, btw?
> > > 
> > 
> > No. We only need it in RX path. Sasha's threadpool patch breaks this.
> > I'm just moving it back.
> > 
> 
> I've moved the kvm__irq_line() call out because what would happen
> sometimes is the following:
>  - Call kvm__irq_line() to signal completion.
>  - virt_queue__available() would return true.
>  - readv() call blocks.
> 
> I figured it happens because we catch virt_queue in while it's being
> updated by the guest and use an erroneous state of it.

How can the vring.avail flag be 'erroneous'?

Can virt_queue__available() really be true while it's not really true? How is 
that possible and what are the semantics / expected behavior of interaction 
with the virtual ring?

It's this code AFAICS:

        if (!vq->vring.avail)
                return 0;
        return vq->vring.avail->idx !=  vq->last_avail_idx;

And it's used like this:

static void virtio_net_rx_callback(struct kvm *self, void *param)
{
        struct iovec iov[VIRTIO_NET_QUEUE_SIZE];
        struct virt_queue *vq;
        uint16_t out, in;
        uint16_t head;
        int len;

        vq = param;

        while (virt_queue__available(vq)) {
                head = virt_queue__get_iov(vq, iov, &out, &in, self);
                len = readv(net_device.tap_fd, iov, in);
                virt_queue__set_used_elem(vq, head, len);
        }

        kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
}

The guest kernel has access to this same virt_queue structure and updates it - 
possibly in parallel to the virtio-net thread.

( One thing that sticks out is that there are no memory barriers used - 
  although very likely that is not needed right now and it is not related to 
  the stalls. )

The rx_callback() is the one where the host receives packets from the TAP 
device, right? I.e. the host receives brand new packets here and forwards them 
to the guest.

If that is so then indeed the right approach might be to signal the guest every 
time we manage to readv() something - another packet might not arrive for a 
long time. But the reason is not some 'erroneous state' - all the state is 
perfectly fine; this is simply a basic property of the event loop that the rx 
thread implements ...

And no, the mutex lock is not needed around the guest signalling - why would it 
be needed?

The thing is, the queueing and state machine code within virtio-net.c is rather 
hard to read - do people actually understand it? It took me several tries until 
I convinced myself that with this fix it would work. The whole file does not 
have a *single* comment!

So please document it much better, document the state machine, document the 
interaction between the rx and tx threads and the rx and tx queue, document the 
interaction between guest and host, outline what happens when any of the queues 
gets full, etc. etc. Parallel state machines are exceedingly complex and 
subtle, not documenting them is one of the seven deadly sins ...

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html