Re: [PATCH 1/2] kvm tools: Respect ISR status in virtio header

On 05/07/2011 05:30 PM, Ingo Molnar wrote:
> 
> * Asias He <asias.hejun@xxxxxxxxx> wrote:
> 
>> Inject an IRQ to the guest only when the ISR status is low, which
>> means the guest has read the ISR status and the device has cleared
>> this bit as a side effect of that read.
>>
>> This avoids a lot of unnecessary IRQ injections from device to
>> guest.
>>
>> Netperf tests show this patch changes:
>>
>> the host to guest bandwidth
>> from 2866.27 Mbps (cpu 33.96%) to 5548.87 Mbps (cpu 53.87%),
>>
>> the guest to host bandwidth
>> from 1408.86 Mbps (cpu 99.9%) to 1301.29 Mbps (cpu 99.9%).
>>
>> The bottleneck for the guest to host bandwidth is guest cpu power.
>>
>> Signed-off-by: Asias He <asias.hejun@xxxxxxxxx>
>> ---
>>  tools/kvm/include/kvm/virtio.h |    5 +++++
>>  tools/kvm/virtio/core.c        |    8 ++++++++
>>  tools/kvm/virtio/net.c         |   12 ++++++++----
>>  3 files changed, 21 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/kvm/include/kvm/virtio.h b/tools/kvm/include/kvm/virtio.h
>> index e8df8eb..7f92dea 100644
>> --- a/tools/kvm/include/kvm/virtio.h
>> +++ b/tools/kvm/include/kvm/virtio.h
>> @@ -8,6 +8,9 @@
>>  
>>  #include "kvm/kvm.h"
>>  
>> +#define VIRTIO_IRQ_LOW		0
>> +#define VIRTIO_IRQ_HIGH		1
>> +
>>  struct virt_queue {
>>  	struct vring	vring;
>>  	u32		pfn;
>> @@ -37,4 +40,6 @@ struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32
>>  
>>  u16 virt_queue__get_iov(struct virt_queue *queue, struct iovec iov[], u16 *out, u16 *in, struct kvm *kvm);
>>  
>> +void virt_queue__trigger_irq(struct virt_queue *vq, int irq, u8 *isr, struct kvm *kvm);
>> +
>>  #endif /* KVM__VIRTIO_H */
>> diff --git a/tools/kvm/virtio/core.c b/tools/kvm/virtio/core.c
>> index 18d2c41..0734984 100644
>> --- a/tools/kvm/virtio/core.c
>> +++ b/tools/kvm/virtio/core.c
>> @@ -57,3 +57,11 @@ u16 virt_queue__get_iov(struct virt_queue *queue, struct iovec iov[], u16 *out,
>>  
>>  	return head;
>>  }
>> +
>> +void virt_queue__trigger_irq(struct virt_queue *vq, int irq, u8 *isr, struct kvm *kvm)
>> +{
>> +	if (*isr == VIRTIO_IRQ_LOW) {
>> +		*isr = VIRTIO_IRQ_HIGH;
>> +		kvm__irq_line(kvm, irq, VIRTIO_IRQ_HIGH);
>> +	}
>> +}
>> diff --git a/tools/kvm/virtio/net.c b/tools/kvm/virtio/net.c
>> index df69ab3..0189f7d 100644
>> --- a/tools/kvm/virtio/net.c
>> +++ b/tools/kvm/virtio/net.c
>> @@ -35,6 +35,7 @@ struct net_device {
>>  	u32				guest_features;
>>  	u16				config_vector;
>>  	u8				status;
>> +	u8				isr;
>>  	u16				queue_selector;
>>  
>>  	pthread_t			io_rx_thread;
>> @@ -88,8 +89,9 @@ static void *virtio_net_rx_thread(void *p)
>>  			head	= virt_queue__get_iov(vq, iov, &out, &in, self);
>>  			len	= readv(net_device.tap_fd, iov, in);
>>  			virt_queue__set_used_elem(vq, head, len);
>> +
>>  			/* We should interrupt guest right now, otherwise latency is huge. */
>> -			kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
>> +			virt_queue__trigger_irq(vq, VIRTIO_NET_IRQ, &net_device.isr, self);
>>  		}
>>  
>>  	}
>> @@ -123,7 +125,8 @@ static void *virtio_net_tx_thread(void *p)
>>  			virt_queue__set_used_elem(vq, head, len);
>>  		}
>>  
>> -		kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
>> +		virt_queue__trigger_irq(vq, VIRTIO_NET_IRQ, &net_device.isr, self);
>> +
>>  	}
>>  
>>  	pthread_exit(NULL);
>> @@ -175,8 +178,9 @@ static bool virtio_net_pci_io_in(struct kvm *self, u16 port, void *data, int siz
>>  		ioport__write8(data, net_device.status);
>>  		break;
>>  	case VIRTIO_PCI_ISR:
>> -		ioport__write8(data, 0x1);
>> -		kvm__irq_line(self, VIRTIO_NET_IRQ, 0);
>> +		ioport__write8(data, net_device.isr);
>> +		kvm__irq_line(self, VIRTIO_NET_IRQ, VIRTIO_IRQ_LOW);
>> +		net_device.isr = VIRTIO_IRQ_LOW;
>>  		break;
>>  	case VIRTIO_MSI_CONFIG_VECTOR:
>>  		ioport__write16(data, net_device.config_vector);
> 
> Hm, the ISR flag seems to be an explicit IRQ-ack mechanism, not just an 
> optimization.
> 
> Perhaps the guest kernel side virtio driver expects us to honor these 
> acks and not inject double IRQs when it does not expect them?
> 
> There's this code in drivers/virtio/virtio_pci.c:
> 
>         /* reading the ISR has the effect of also clearing it so it's very
>          * important to save off the value. */
>         isr = ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
> 
> Which seems to suggest that this ISR flag is more important than just a 
> performance hint.
> 
> Pekka: was this perhaps the patch that fixed the ping latency problem for you?
> 
> Could any virtio gents on Cc: please confirm/deny this theory? :-)
> 
> The original problem was that the virtio-net driver in tools/kvm/virtio/net.c 
> was producing unexplained latencies (long ping latencies) under certain 
> circumstances. Sometimes it triggered spontaneously, sometimes it needed a ping 
> -f flood to trigger. The root cause of that race is still not understood.
> 

I used KVM_IRQ_LINE_STATUS instead of KVM_IRQ_LINE to trigger the IRQ
while I was fighting the network stall/hang issue. KVM_IRQ_LINE_STATUS
reports the IRQ injection status to userspace. See commit
4925663a079c77d95d8685228ad6675fc5639c8e for details.
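
Roughly what I did, as a sketch (not the code in the tree; it assumes
the VM fd is exposed as kvm->vm_fd, as in tools/kvm):

	#include <sys/ioctl.h>
	#include <linux/kvm.h>
	#include "kvm/kvm.h"

	/* Sketch: like kvm__irq_line(), but using KVM_IRQ_LINE_STATUS so
	 * the kernel reports the injection result back to userspace. */
	static int kvm__irq_line_status(struct kvm *kvm, int irq, int level)
	{
		struct kvm_irq_level irq_level = {
			.irq	= irq,
			.level	= level,
		};

		if (ioctl(kvm->vm_fd, KVM_IRQ_LINE_STATUS, &irq_level) < 0)
			return -1;

		/* irq_level.status: > 0 means delivered, 0 means coalesced
		 * with a previous interrupt, < 0 means not delivered. */
		return irq_level.status;
	}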

I found a huge number of IRQ injections with status 0 or -1 when the
guest and the host were ping flooding each other simultaneously. I
think the root cause is the IRQ race.

I also found that when the network hangs, the guest kernel refuses to
give any avail buffers in the rx queue to the device. At that point,
vq->last_used_idx equals vq->vring.used->idx in the rx queue, so even
with manual IRQ injection using the debug key ctrl-a-i the network
still hangs.
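
(Those fields are the guest driver's, in drivers/virtio/virtio_ring.c;
its used-ring check is roughly the one below, and while it is false an
injected rx interrupt finds nothing to do.)

	/* drivers/virtio/virtio_ring.c (guest side), roughly: */
	static inline bool more_used(const struct vring_virtqueue *vq)
	{
		return vq->last_used_idx != vq->vring.used->idx;
	}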

BTW, the ping latency was caused by moving the IRQ injection outside
the loop. Suppose we have 5 available buffers but only 1 buffer's worth
of data from the tap device: we sleep on the read without handing the
already filled buffer from the tap to the guest, so the latency becomes
huge in this case:

  while (virt_queue__available(vq)) {
	...
	read(tap_fd)	/* sleeps here once the tap has no more data */
	...
  }
  trigger_irq()		/* guest only sees the data after the loop exits */
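
That is why the patch keeps the trigger inside the loop, once per used
buffer, roughly as in the rx thread above:

	while (virt_queue__available(vq)) {
		head	= virt_queue__get_iov(vq, iov, &out, &in, self);
		len	= readv(net_device.tap_fd, iov, in);	/* may sleep */
		virt_queue__set_used_elem(vq, head, len);
		/* interrupt per buffer, so data we already have is not
		 * held back by the next (possibly blocking) read */
		virt_queue__trigger_irq(vq, VIRTIO_NET_IRQ, &net_device.isr, self);
	}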

-- 
Best Regards,
Asias He

