So the idea is as follows: we convert descriptors to an independent format first, and process that converting to iov later. The point is that we have a tight loop that fetches descriptors, which is good for cache utilization. This will also allow all kind of batching tricks - e.g. it seems possible to keep SMAP disabled while we are fetching multiple descriptors. And perhaps more importantly, this is a very good fit for the packed ring layout, where we get and put descriptors in order. This patchset seems to already perform exactly the same as the original code already based on a microbenchmark. More testing would be very much appreciated. Biggest TODO before this first step is ready to go in is to batch indirect descriptors as well. Integrating into vhost-net is basically s/vhost_get_vq_desc/vhost_get_vq_desc_batch/ - or add a module parameter like I did in the test module. Michael S. Tsirkin (2): vhost: option to fetch descriptors through an independent struct vhost: batching fetches drivers/vhost/test.c | 19 ++- drivers/vhost/vhost.c | 333 +++++++++++++++++++++++++++++++++++++++++- drivers/vhost/vhost.h | 20 ++- 3 files changed, 365 insertions(+), 7 deletions(-) -- MST