On 12/07/2011 07:10 AM, Sridhar Samudrala wrote:
On 12/6/2011 8:14 AM, Michael S. Tsirkin wrote:
On Tue, Dec 06, 2011 at 07:42:54AM -0800, Sridhar Samudrala wrote:
On 12/6/2011 5:15 AM, Stefan Hajnoczi wrote:
On Tue, Dec 6, 2011 at 10:21 AM, Jason Wang <jasowang@xxxxxxxxxx> wrote:
On 12/06/2011 05:18 PM, Stefan Hajnoczi wrote:
On Tue, Dec 6, 2011 at 6:33 AM, Jason Wang <jasowang@xxxxxxxxxx> wrote:
On 12/05/2011 06:55 PM, Stefan Hajnoczi wrote:
On Mon, Dec 5, 2011 at 8:59 AM, Jason Wang <jasowang@xxxxxxxxxx> wrote:
The vcpus are just threads and may not be bound to physical CPUs, so what is the big picture here? Is the guest even in a position to set the best queue mappings today?
Not sure it could publish the best mapping, but the idea is to make sure the packets of a flow are handled by the same guest vcpu, and maybe the same vhost thread, in order to eliminate packet reordering and lock contention. But this assumption does not take into account the bouncing of vhost or vcpu threads, which would also affect the result.
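To make "steering" concrete: it is roughly a hash over the flow tuple picking a fixed queue, so every packet of one flow lands on the same queue and hence the same vcpu/vhost thread. A minimal userspace sketch, with made-up names (flow_tuple, flow_hash, pick_queue) and a toy hash rather than the real Toeplitz/RSS function:

/* Toy flow steering: hash a 4-tuple to a queue index so that all
 * packets of one flow stay on one queue. Illustrative only. */
#include <stdint.h>
#include <stdio.h>

struct flow_tuple {
        uint32_t saddr, daddr;  /* IPv4 source/destination address */
        uint16_t sport, dport;  /* TCP/UDP source/destination port */
};

/* Simple integer mix; real NICs would use Toeplitz hashing. */
static uint32_t flow_hash(const struct flow_tuple *ft)
{
        uint32_t h = ft->saddr ^ ft->daddr;

        h ^= ((uint32_t)ft->sport << 16) | ft->dport;
        h ^= h >> 16;
        h *= 0x45d9f3b;
        h ^= h >> 16;
        return h;
}

static unsigned int pick_queue(const struct flow_tuple *ft,
                               unsigned int nqueues)
{
        return flow_hash(ft) % nqueues;
}

int main(void)
{
        struct flow_tuple ft = { 0x0a000001, 0x0a000002, 12345, 80 };

        printf("flow -> queue %u of 4\n", pick_queue(&ft, 4));
        return 0;
}

The point is only that the mapping is stable per flow; whether the chosen queue also matches the vcpu the application runs on is exactly the open question above.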
Okay, this is why I'd like to know what the big picture here is. What solution are you proposing? How are we going to have everything from the guest application, guest kernel, host threads, and host NIC driver play along so we get the right steering up the entire stack? I think there needs to be an answer to that before changing virtio-net to add any steering mechanism.
Yes. Also, the current model of a vhost thread per VM interface doesn't help with packet steering all the way from the guest to the host physical NIC. I think we need per-CPU vhost thread(s) that can handle packets to/from the physical NIC's TX/RX queues. Currently we have a single vhost thread per VM interface that handles all the packets from the various flows coming from a multi-queue physical NIC.
Thanks
Sridhar
It's not hard to try that:

1. Revert c23f3445e68e1db0e74099f264bc5ff5d55ebdeb; this converts our thread back to a workqueue.
2. Convert the workqueue to a per-cpu one.

It didn't work that well in the past, but YMMV.
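For step 2, the shape of the change would be roughly as in this hedged module sketch (the demo_* names are made up; this is not the actual vhost patch). alloc_workqueue() without WQ_UNBOUND keeps per-cpu worker pools, and queue_work_on() pins a work item to a chosen CPU:

/* Sketch: run vhost-style work on a per-cpu workqueue instead of a
 * dedicated per-interface kthread. Illustrative module code only. */
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/smp.h>

static struct workqueue_struct *demo_wq;
static struct work_struct demo_work;

static void demo_fn(struct work_struct *work)
{
        pr_info("demo work ran on cpu %d\n", raw_smp_processor_id());
}

static int __init demo_init(void)
{
        /* No WQ_UNBOUND: items queued with queue_work_on() execute
         * on the per-cpu worker pool of the CPU we name. */
        demo_wq = alloc_workqueue("demo_vhost", 0, 0);
        if (!demo_wq)
                return -ENOMEM;
        INIT_WORK(&demo_work, demo_fn);
        queue_work_on(raw_smp_processor_id(), demo_wq, &demo_work);
        return 0;
}

static void __exit demo_exit(void)
{
        flush_workqueue(demo_wq);
        destroy_workqueue(demo_wq);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

In vhost terms, each virtqueue's work item would be queued on whichever CPU should service it, which is where the steering question comes back in.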
Yes. I tried this before we went ahead with the per-interface vhost threading model. At that time, per-cpu vhost showed a regression with a single VM, while per-vq vhost showed good performance improvements with up to 8 VMs. So just making it per-cpu would not be enough; I think we may need a way to schedule vcpu threads on the same CPU socket as vhost.
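That co-scheduling could be prototyped from userspace by pinning a vcpu thread onto the CPUs of the socket the vhost thread runs on. A hedged sketch; the tid argument and the hard-coded CPU list are assumptions, and real code would read the topology from sysfs:

/* Pin a thread (by tid) to a fixed set of CPUs, e.g. one socket. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

static int pin_to_socket(pid_t tid, const int *cpus, int ncpus)
{
        cpu_set_t set;
        int i;

        CPU_ZERO(&set);
        for (i = 0; i < ncpus; i++)
                CPU_SET(cpus[i], &set);
        /* tid 0 means the calling thread. */
        return sched_setaffinity(tid, sizeof(set), &set);
}

int main(int argc, char **argv)
{
        int socket0[] = { 0, 1, 2, 3 };  /* assume socket 0 = CPUs 0-3 */
        pid_t tid = argc > 1 ? (pid_t)atoi(argv[1]) : 0;

        if (pin_to_socket(tid, socket0, 4)) {
                perror("sched_setaffinity");
                return 1;
        }
        return 0;
}

With the vhost thread pinned the same way, vcpu and vhost at least share the socket's last-level cache, which is presumably why same-socket placement matters.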
Another aspect we need to look into is splitting the vhost thread into separate threads for TX and RX. Shirley is doing some work in this area, and she is seeing performance improvements as long as the TX and RX threads are on the same CPU socket.
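For reference, the split itself amounts to two kthreads bound to cores on one socket, along these lines (an illustrative sketch only, not Shirley's actual patches; tx_fn/rx_fn are placeholders, and CPUs 0 and 1 stand in for two cores of one socket):

/* Sketch: one TX and one RX kthread, each bound to its own CPU. */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *tx_task, *rx_task;

static int tx_fn(void *data)
{
        while (!kthread_should_stop()) {
                /* TX virtqueue work would go here. */
                schedule_timeout_interruptible(HZ);
        }
        return 0;
}

static int rx_fn(void *data)
{
        while (!kthread_should_stop()) {
                /* RX virtqueue work would go here. */
                schedule_timeout_interruptible(HZ);
        }
        return 0;
}

static int __init split_init(void)
{
        tx_task = kthread_create(tx_fn, NULL, "vhost-tx-demo");
        if (IS_ERR(tx_task))
                return PTR_ERR(tx_task);
        rx_task = kthread_create(rx_fn, NULL, "vhost-rx-demo");
        if (IS_ERR(rx_task)) {
                kthread_stop(tx_task);
                return PTR_ERR(rx_task);
        }
        /* Bind before first wakeup; CPUs 0 and 1 are an assumption. */
        kthread_bind(tx_task, 0);
        kthread_bind(rx_task, 1);
        wake_up_process(tx_task);
        wake_up_process(rx_task);
        return 0;
}

static void __exit split_exit(void)
{
        kthread_stop(tx_task);
        kthread_stop(rx_task);
}

module_init(split_init);
module_exit(split_exit);
MODULE_LICENSE("GPL");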
I emulated this through my multi-queue series in the past; it looks like it hurts single-stream performance, especially guest TX.
On the surface I'd say a single thread makes some sense as long as the guest uses a single queue.
But this may not be scalable in the long term, when we want to support a large number of VMs, each having multiple virtio-net interfaces with multiple queues.
Thanks
Sridhar