Re: [RFC 1/5] vhost: reorder used descriptors in a batch

----- Original Message -----
> From: "eperezma" <eperezma@xxxxxxxxxx>
> To: "Guo Zhi" <qtxuning1999@xxxxxxxxxxx>
> Cc: "jasowang" <jasowang@xxxxxxxxxx>, "sgarzare" <sgarzare@xxxxxxxxxx>, "Michael Tsirkin" <mst@xxxxxxxxxx>, "netdev"
> <netdev@xxxxxxxxxxxxxxx>, "linux-kernel" <linux-kernel@xxxxxxxxxxxxxxx>, "kvm list" <kvm@xxxxxxxxxxxxxxx>,
> "virtualization" <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx>
> Sent: Friday, July 22, 2022 3:07:17 PM
> Subject: Re: [RFC 1/5] vhost: reorder used descriptors in a batch

> On Thu, Jul 21, 2022 at 10:44 AM Guo Zhi <qtxuning1999@xxxxxxxxxxx> wrote:
>>
>> Devices may not use descriptors in order; for example, NIC and SCSI may
>> not call __vhost_add_used_n with buffers in order.  It's the task of
>> __vhost_add_used_n to order them.  This commit reorders the buffers
>> using vq->heads; only if the batch begins at the expected start point
>> and is continuous can the batch be exposed to the driver.  A single
>> used ring entry is written out for a whole batch of descriptors,
>> according to the VIRTIO 1.1 spec.
>>
>> Signed-off-by: Guo Zhi <qtxuning1999@xxxxxxxxxxx>
>> ---
>>  drivers/vhost/vhost.c | 44 +++++++++++++++++++++++++++++++++++++++++--
>>  drivers/vhost/vhost.h |  3 +++
>>  2 files changed, 45 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 40097826c..e2e77e29f 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -317,6 +317,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
>>         vq->used_flags = 0;
>>         vq->log_used = false;
>>         vq->log_addr = -1ull;
>> +       vq->next_used_head_idx = 0;
>>         vq->private_data = NULL;
>>         vq->acked_features = 0;
>>         vq->acked_backend_features = 0;
>> @@ -398,6 +399,8 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
>>                                           GFP_KERNEL);
>>                 if (!vq->indirect || !vq->log || !vq->heads)
>>                         goto err_nomem;
>> +
>> +               memset(vq->heads, 0, sizeof(*vq->heads) * dev->iov_limit);
>>         }
>>         return 0;
>>
>> @@ -2374,12 +2377,49 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
>>                             unsigned count)
>>  {
>>         vring_used_elem_t __user *used;
>> +       struct vring_desc desc;
>>         u16 old, new;
>>         int start;
>> +       int begin, end, i;
>> +       int copy_n = count;
>> +
>> +       if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
>> +               /* calculate descriptor chain length for each used buffer */
>> +               for (i = 0; i < count; i++) {
>> +                       begin = heads[i].id;
>> +                       end = begin;
>> +                       vq->heads[begin].len = 0;
>> +                       do {
>> +                               vq->heads[begin].len += 1;
>> +                               if (unlikely(vhost_get_desc(vq, &desc, end))) {
>> +                                       vq_err(vq, "Failed to get descriptor: idx %d addr %p\n",
>> +                                              end, vq->desc + end);
>> +                                       return -EFAULT;
>> +                               }
>> +                       } while ((end = next_desc(vq, &desc)) != -1);
>> +               }
>> +
>> +               count = 0;
>> +               /* sort and batch continuous used ring entry */
>> +               while (vq->heads[vq->next_used_head_idx].len != 0) {
>> +                       count++;
>> +                       i = vq->next_used_head_idx;
>> +                       vq->next_used_head_idx = (vq->next_used_head_idx +
>> +                                                 vq->heads[vq->next_used_head_idx].len)
>> +                                                 % vq->num;
>> +                       vq->heads[i].len = 0;
>> +               }
> 
> You're iterating vq->heads with two different indexes here.
> 
> The first loop is working with indexes [0, count), which is fine if
> heads is a "cache" and everything can be overwritten (as it used to be
> before this patch).
> 
> The other loop trusts in vq->next_used_head_idx, which is saved between calls.
> 
> So both uses are going to conflict with each other.
> 

The first loop calculates the length of each used descriptor chain, and the
second finds the start point of the next batch. The range the second loop
walks includes the entries the first loop just filled in, so the two uses
work together rather than conflict.
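
To make that concrete, here is a hypothetical trace of the two loops
(assuming vq->num = 4, next_used_head_idx starting at 0, and the
count == 0 early return mentioned further down):

/*
 * Two chains are used out of order: chain B = {head id 2, 1 desc}
 * completes first, chain A = {head id 0, 2 descs} second.
 *
 * Call 1: heads = [{id = 2}]
 *   first loop:  vq->heads[2].len = 1
 *   second loop: vq->heads[0].len == 0, so count stays 0 and
 *                nothing is written to the used ring yet
 * Call 2: heads = [{id = 0}]
 *   first loop:  vq->heads[0].len = 2
 *   second loop: idx 0 (len 2) -> idx 2 (len 1) -> idx 3 (len 0),
 *                stop with count == 2, i == 2, next_used_head_idx == 3
 *   heads[0].id = 2, copy_n = 1: a single used entry exposes both
 *   chains at once
 */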

> A proposal for checking this is to push incrementing data in the chains
> at the virtio_test driver, and check that they are returned properly.
> For example, the first buffer in the chain has the value N, the second
> one N+1, and so on.
> 

LGTM. I'll try this to enhance the test.
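
Roughly what I have in mind for the test (a sketch only; chain_bufs,
chain_len and the seed parameter are made up for illustration, and the
real test would build the chains with the existing virtio_test helpers):

#include <string.h>

/* Fill each buffer of a chain with an incrementing pattern, so the
 * check below can detect any buffer that came back out of order.
 */
static void fill_chain(unsigned char **chain_bufs, int chain_len,
		       size_t buf_size, unsigned char seed)
{
	int i;

	for (i = 0; i < chain_len; i++)
		memset(chain_bufs[i], seed + i, buf_size);
}

/* Verify the pattern after the chain is returned as used. */
static int check_chain(unsigned char **chain_bufs, int chain_len,
		       size_t buf_size, unsigned char seed)
{
	int i;
	size_t j;

	for (i = 0; i < chain_len; i++)
		for (j = 0; j < buf_size; j++)
			if (chain_bufs[i][j] != (unsigned char)(seed + i))
				return -1;
	return 0;
}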

> Let's split saving chains into its own patch.
> 
> 
>> +               /* only write out a single used ring entry with the id
>> +                * corresponding to the head entry of the descriptor
>> +                * chain describing the last buffer in the batch.
>> +                */
> 
> Let's delay the batching for now; we can add it as an optimization on
> top in the case of devices.
> 
> My proposal is to define a new struct vring_used_elem_inorder:
> 
> struct vring_used_elem_inorder {
>    uint16_t written;
>    uint16_t num;
> };
> 
> And create a per vq array of them, with vq->num size. Let's call it
> used_inorder for example.
> 
> Every time the device uses a buffer chain of N buffers, with written
> length L and first descriptor id D, it stores vq->used_inorder[D] =
> { .written = L, .num = N }. .num == 0 means the buffer is not available.
> 
> After storing that information, you have your next_used_head_idx. You
> can check whether vq->used_inorder[next_used_head_idx] is used (.num != 0).
> If it is not, there is no need to perform any action for now.
> 
> If it is, you iterate vq->used_inorder: first you write next_used_head_idx
> as used; after that, next_used_head_idx increments by .num, and .num needs
> to be cleared. If vq->used_inorder[vq->next_used_head_idx] is used too,
> repeat.
> 
> I think we could even squash vq->heads and vq->used_inorder with some
> tricks, because a chain's length is always greater than or equal to its
> used descriptor count, but storing them in a separate array is clearer.
> 

I think this algorithm is the same as the one in the patch. But it is
better to add a struct named vring_used_elem_inorder instead of reusing
vq->heads, which is clearer.
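
As I read the proposal, the bookkeeping would look roughly like this
(a sketch only, under the assumption of a new per-vq used_inorder array
allocated alongside vq->heads; not the eventual patch):

struct vring_used_elem_inorder {
	u16 written;
	u16 num;	/* 0 means no used chain is stored here */
};

/* Record a used chain whose first descriptor id is d. */
static void vhost_record_inorder(struct vhost_virtqueue *vq,
				 u16 d, u16 written, u16 num)
{
	vq->used_inorder[d].written = written;
	vq->used_inorder[d].num = num;
}

/* Expose every chain that is now contiguous with next_used_head_idx. */
static void vhost_flush_inorder(struct vhost_virtqueue *vq)
{
	while (vq->used_inorder[vq->next_used_head_idx].num) {
		struct vring_used_elem_inorder *e =
			&vq->used_inorder[vq->next_used_head_idx];
		u16 num = e->num;

		/* ... write one used ring entry for this chain ... */

		e->num = 0;
		vq->next_used_head_idx = (vq->next_used_head_idx + num)
					 % vq->num;
	}
}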

>> +               heads[0].id = i;
>> +               copy_n = 1;
> 
> The device must not write anything to the used ring if the next
> descriptor has not been used. I'm failing to trace how this works when
> the second half of the batch in vhost/test.c is used here.
> 
> Thanks!
> 
> 

Sorry, my mistake: I forgot to add the if (count == 0) check to the patch.
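
Concretely, the end of the in-order branch should read something like
this (sketch against the code above; the early return keeps the device
from writing to the used ring and from reading the uninitialized i):

		/* No chain contiguous with next_used_head_idx has been
		 * used yet, so nothing may be exposed to the driver.
		 */
		if (count == 0)
			return 0;

		heads[0].id = i;
		copy_n = 1;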

>> +       }
>>
>>         start = vq->last_used_idx & (vq->num - 1);
>>         used = vq->used->ring + start;
>> -       if (vhost_put_used(vq, heads, start, count)) {
>> +       if (vhost_put_used(vq, heads, start, copy_n)) {
>>                 vq_err(vq, "Failed to write used");
>>                 return -EFAULT;
>>         }
>> @@ -2410,7 +2450,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
>>
>>         start = vq->last_used_idx & (vq->num - 1);
>>         n = vq->num - start;
>> -       if (n < count) {
>> +       if (n < count && !vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
>>                 r = __vhost_add_used_n(vq, heads, n);
>>                 if (r < 0)
>>                         return r;
>> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
>> index d9109107a..7b2c0fbb5 100644
>> --- a/drivers/vhost/vhost.h
>> +++ b/drivers/vhost/vhost.h
>> @@ -107,6 +107,9 @@ struct vhost_virtqueue {
>>         bool log_used;
>>         u64 log_addr;
>>
>> +       /* Sort heads in order */
>> +       u16 next_used_head_idx;
>> +
>>         struct iovec iov[UIO_MAXIOV];
>>         struct iovec iotlb_iov[64];
>>         struct iovec *indirect;
>> --
>> 2.17.1
>>


