----- Original Message ----- > From: "eperezma" <eperezma@xxxxxxxxxx> > To: "Guo Zhi" <qtxuning1999@xxxxxxxxxxx> > Cc: "jasowang" <jasowang@xxxxxxxxxx>, "sgarzare" <sgarzare@xxxxxxxxxx>, "Michael Tsirkin" <mst@xxxxxxxxxx>, "netdev" > <netdev@xxxxxxxxxxxxxxx>, "linux-kernel" <linux-kernel@xxxxxxxxxxxxxxx>, "kvm list" <kvm@xxxxxxxxxxxxxxx>, > "virtualization" <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx> > Sent: Friday, July 22, 2022 3:07:17 PM > Subject: Re: [RFC 1/5] vhost: reorder used descriptors in a batch > On Thu, Jul 21, 2022 at 10:44 AM Guo Zhi <qtxuning1999@xxxxxxxxxxx> wrote: >> >> Device may not use descriptors in order, for example, NIC and SCSI may >> not call __vhost_add_used_n with buffers in order. It's the task of >> __vhost_add_used_n to order them. This commit reorder the buffers using >> vq->heads, only the batch is begin from the expected start point and is >> continuous can the batch be exposed to driver. And only writing out a >> single used ring for a batch of descriptors, according to VIRTIO 1.1 >> spec. >> >> Signed-off-by: Guo Zhi <qtxuning1999@xxxxxxxxxxx> >> --- >> drivers/vhost/vhost.c | 44 +++++++++++++++++++++++++++++++++++++++++-- >> drivers/vhost/vhost.h | 3 +++ >> 2 files changed, 45 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c >> index 40097826c..e2e77e29f 100644 >> --- a/drivers/vhost/vhost.c >> +++ b/drivers/vhost/vhost.c >> @@ -317,6 +317,7 @@ static void vhost_vq_reset(struct vhost_dev *dev, >> vq->used_flags = 0; >> vq->log_used = false; >> vq->log_addr = -1ull; >> + vq->next_used_head_idx = 0; >> vq->private_data = NULL; >> vq->acked_features = 0; >> vq->acked_backend_features = 0; >> @@ -398,6 +399,8 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev) >> GFP_KERNEL); >> if (!vq->indirect || !vq->log || !vq->heads) >> goto err_nomem; >> + >> + memset(vq->heads, 0, sizeof(*vq->heads) * dev->iov_limit); >> } >> return 0; >> >> @@ -2374,12 +2377,49 @@ static int __vhost_add_used_n(struct vhost_virtqueue >> *vq, >> unsigned count) >> { >> vring_used_elem_t __user *used; >> + struct vring_desc desc; >> u16 old, new; >> int start; >> + int begin, end, i; >> + int copy_n = count; >> + >> + if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) { >> + /* calculate descriptor chain length for each used buffer */ >> + for (i = 0; i < count; i++) { >> + begin = heads[i].id; >> + end = begin; >> + vq->heads[begin].len = 0; >> + do { >> + vq->heads[begin].len += 1; >> + if (unlikely(vhost_get_desc(vq, &desc, end))) { >> + vq_err(vq, "Failed to get descriptor: >> idx %d addr %p\n", >> + end, vq->desc + end); >> + return -EFAULT; >> + } >> + } while ((end = next_desc(vq, &desc)) != -1); >> + } >> + >> + count = 0; >> + /* sort and batch continuous used ring entry */ >> + while (vq->heads[vq->next_used_head_idx].len != 0) { >> + count++; >> + i = vq->next_used_head_idx; >> + vq->next_used_head_idx = (vq->next_used_head_idx + >> + >> vq->heads[vq->next_used_head_idx].len) >> + % vq->num; >> + vq->heads[i].len = 0; >> + } > > You're iterating vq->heads with two different indexes here. > > The first loop is working with indexes [0, count), which is fine if > heads is a "cache" and everything can be overwritten (as it used to be > before this patch). > > The other loop trusts in vq->next_used_head_idx, which is saved between calls. > > So both uses are going to conflict with each other. > The first loop is to calculate the length of each descriptor, and the next is to find the begin point of next batch. The next loop contains the first loop. > A proposal for checking this is to push the data in the chains > incrementally at the virtio_test driver, and check that they are > returned properly. Like, the first buffer in the chain has the value > of N, the second one N+1, and so on. > LGTM. I'll try this to enhance the test. > Let's split saving chains in its own patch. > > >> + /* only write out a single used ring entry with the id >> corresponding >> + * to the head entry of the descriptor chain describing the last >> buffer >> + * in the batch. >> + */ > > Let's delay the batching for now, we can add it as an optimization on > top in the case of devices. > > My proposal is to define a new struct vring_used_elem_inorder: > > struct vring_used_elem_inorder { > uint16_t written' > uint16_t num; > } > > And create a per vq array of them, with vq->num size. Let's call it > used_inorder for example. > > Everytime the device uses a buffer chain of N buffers, written L and > first descriptor id D, it stores vq->used_inorder[D] = { .written = L, > .num = N }. .num == 0 means the buffer is not available. > > After storing that information, you have your next_used_head_idx. You > can check if vq->used_inorder[next_used_head_idx] is used (.num != 0). > In case is not, there is no need to perform any actions for now. > > In case it is, you iterate vq->used_inorder. First you write as used > next_used_head_idx. After that, next_used_head_idx increments by .num, > and we need to clean .num. If vq->used_inorder[vq->next_used_head_idx] > is used too, repeat. > > I think we could even squash vq->heads and vq->used_inorder with some > tricks, because a chain's length would always be bigger or equal than > used descriptor one, but to store in a different array would be more > clear. > I think this algorithm is the same with that in the patch. But it is better to add a struct named vring_used_elem_inorder instead of vq->heads, which is more clear. >> + heads[0].id = i; >> + copy_n = 1; > > The device must not write anything to the used ring if the next > descriptor has not been used. I'm failing to trace how this works when > the second half of the batch in vhost/test.c is used here. > > Thanks! > > Sorry for my mistake, I forgot add the check if(count == 0) in the patch. >> + } >> >> start = vq->last_used_idx & (vq->num - 1); >> used = vq->used->ring + start; >> - if (vhost_put_used(vq, heads, start, count)) { >> + if (vhost_put_used(vq, heads, start, copy_n)) { >> vq_err(vq, "Failed to write used"); >> return -EFAULT; >> } >> @@ -2410,7 +2450,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct >> vring_used_elem *heads, >> >> start = vq->last_used_idx & (vq->num - 1); >> n = vq->num - start; >> - if (n < count) { >> + if (n < count && !vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) { >> r = __vhost_add_used_n(vq, heads, n); >> if (r < 0) >> return r; >> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h >> index d9109107a..7b2c0fbb5 100644 >> --- a/drivers/vhost/vhost.h >> +++ b/drivers/vhost/vhost.h >> @@ -107,6 +107,9 @@ struct vhost_virtqueue { >> bool log_used; >> u64 log_addr; >> >> + /* Sort heads in order */ >> + u16 next_used_head_idx; >> + >> struct iovec iov[UIO_MAXIOV]; >> struct iovec iotlb_iov[64]; >> struct iovec *indirect; >> -- >> 2.17.1 >>