On 05/07/2011 05:30 PM, Ingo Molnar wrote:
>
> * Asias He <asias.hejun@xxxxxxxxx> wrote:
>
>> Inject IRQ to guest only when ISR status is low, which means the
>> guest has read the ISR status and the device has cleared this bit
>> as a side effect of that read.
>>
>> This avoids a lot of unnecessary IRQ injections from device to
>> guest.
>>
>> Netperf tests show this patch changes:
>>
>> the host to guest bandwidth
>> from 2866.27 Mbps (cpu 33.96%) to 5548.87 Mbps (cpu 53.87%),
>>
>> the guest to host bandwidth
>> from 1408.86 Mbps (cpu 99.9%) to 1301.29 Mbps (cpu 99.9%).
>>
>> The bottleneck of the guest to host bandwidth is guest cpu power.
>>
>> Signed-off-by: Asias He <asias.hejun@xxxxxxxxx>
>> ---
>>  tools/kvm/include/kvm/virtio.h |    5 +++++
>>  tools/kvm/virtio/core.c        |    8 ++++++++
>>  tools/kvm/virtio/net.c         |   12 ++++++++----
>>  3 files changed, 21 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/kvm/include/kvm/virtio.h b/tools/kvm/include/kvm/virtio.h
>> index e8df8eb..7f92dea 100644
>> --- a/tools/kvm/include/kvm/virtio.h
>> +++ b/tools/kvm/include/kvm/virtio.h
>> @@ -8,6 +8,9 @@
>>
>>  #include "kvm/kvm.h"
>>
>> +#define VIRTIO_IRQ_LOW		0
>> +#define VIRTIO_IRQ_HIGH		1
>> +
>>  struct virt_queue {
>>  	struct vring	vring;
>>  	u32		pfn;
>> @@ -37,4 +40,6 @@ struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32
>>
>>  u16 virt_queue__get_iov(struct virt_queue *queue, struct iovec iov[], u16 *out, u16 *in, struct kvm *kvm);
>>
>> +void virt_queue__trigger_irq(struct virt_queue *vq, int irq, u8 *isr, struct kvm *kvm);
>> +
>>  #endif /* KVM__VIRTIO_H */
>> diff --git a/tools/kvm/virtio/core.c b/tools/kvm/virtio/core.c
>> index 18d2c41..0734984 100644
>> --- a/tools/kvm/virtio/core.c
>> +++ b/tools/kvm/virtio/core.c
>> @@ -57,3 +57,11 @@ u16 virt_queue__get_iov(struct virt_queue *queue, struct iovec iov[], u16 *out,
>>
>>  	return head;
>>  }
>> +
>> +void virt_queue__trigger_irq(struct virt_queue *vq, int irq, u8 *isr, struct kvm *kvm)
>> +{
>> +	if (*isr == VIRTIO_IRQ_LOW) {
>> +		*isr = VIRTIO_IRQ_HIGH;
>> +		kvm__irq_line(kvm, irq, VIRTIO_IRQ_HIGH);
>> +	}
>> +}
>> diff --git a/tools/kvm/virtio/net.c b/tools/kvm/virtio/net.c
>> index df69ab3..0189f7d 100644
>> --- a/tools/kvm/virtio/net.c
>> +++ b/tools/kvm/virtio/net.c
>> @@ -35,6 +35,7 @@ struct net_device {
>>  	u32			guest_features;
>>  	u16			config_vector;
>>  	u8			status;
>> +	u8			isr;
>>  	u16			queue_selector;
>>
>>  	pthread_t		io_rx_thread;
>> @@ -88,8 +89,9 @@ static void *virtio_net_rx_thread(void *p)
>>  			head = virt_queue__get_iov(vq, iov, &out, &in, self);
>>  			len = readv(net_device.tap_fd, iov, in);
>>  			virt_queue__set_used_elem(vq, head, len);
>> +
>>  			/* We should interrupt guest right now, otherwise latency is huge. */
>> -			kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
>> +			virt_queue__trigger_irq(vq, VIRTIO_NET_IRQ, &net_device.isr, self);
>>  		}
>>
>>  	}
>> @@ -123,7 +125,8 @@ static void *virtio_net_tx_thread(void *p)
>>  			virt_queue__set_used_elem(vq, head, len);
>>  		}
>>
>> -		kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
>> +		virt_queue__trigger_irq(vq, VIRTIO_NET_IRQ, &net_device.isr, self);
>> +
>>  	}
>>
>>  	pthread_exit(NULL);
>> @@ -175,8 +178,9 @@ static bool virtio_net_pci_io_in(struct kvm *self, u16 port, void *data, int siz
>>  		ioport__write8(data, net_device.status);
>>  		break;
>>  	case VIRTIO_PCI_ISR:
>> -		ioport__write8(data, 0x1);
>> -		kvm__irq_line(self, VIRTIO_NET_IRQ, 0);
>> +		ioport__write8(data, net_device.isr);
>> +		kvm__irq_line(self, VIRTIO_NET_IRQ, VIRTIO_IRQ_LOW);
>> +		net_device.isr = VIRTIO_IRQ_LOW;
>>  		break;
>>  	case VIRTIO_MSI_CONFIG_VECTOR:
>>  		ioport__write16(data, net_device.config_vector);
>
> Hm, the ISR flag seems to be an explicit IRQ-ack mechanism, not just an
> optimization.
>
> Perhaps the guest kernel side virtio driver expects us to honor these
> acks and not inject double irqs when the virtio driver does not expect
> them?
>
> There's this code in drivers/virtio/virtio_pci.c:
>
> 	/* reading the ISR has the effect of also clearing it so it's very
> 	 * important to save off the value. */
> 	isr = ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
>
> Which suggests that this ISR flag is more important than just a
> performance hint.
>
> Pekka: was this perhaps the patch that fixed the ping latency problem
> for you?
>
> Could any virtio gents on Cc: please confirm/deny this theory? :-)
>
> The original problem was that the virtio-net driver in
> tools/kvm/virtio/net.c was producing unexplained latencies (long ping
> latencies) under certain circumstances. Sometimes it triggered
> spontaneously, sometimes it needed a ping -f flood to trigger. The
> root cause of that race is still not understood.

I am using KVM_IRQ_LINE_STATUS instead of KVM_IRQ_LINE to trigger the
IRQ while fighting the network stall/hang issue. KVM_IRQ_LINE_STATUS
reports the IRQ injection status to userspace; see commit
4925663a079c77d95d8685228ad6675fc5639c8e for details.

I found that there are a huge number of IRQ injections with status 0 or
-1 when guest and host are ping-flooding each other simultaneously. I
think the root cause is the IRQ race.

I also found that when the network hangs, the guest kernel refuses to
give any avail buffers in the rx queue to the device. At that time,
vq->last_used_idx equals vq->vring.used->idx in the rx queue, so even
with manual IRQ injection using the debug key ctrl-a-i, the network
still hangs.

BTW, the ping latency was caused by the movement of the IRQ injection
outside the loop. Suppose we have 5 available buffers but only 1 buffer
of data from the tap device: we will sleep on read() without giving the
completed buffer from tap to the guest. The latency will be huge in
this case.

	while (virt_queue__available(vq)) {
		...
		read(tap_fd)
		...
	}
	trigger_irq()

-- 
Best Regards,
Asias He
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html