On Mon, 22 Nov 2021 12:08:22 +0100 Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote: > On Mon, Nov 22, 2021 at 08:55:24AM +0100, Stefano Garzarella wrote: > >On Mon, Nov 22, 2021 at 02:25:26PM +0800, Jason Wang wrote: > >>On Mon, Nov 22, 2021 at 1:49 PM Halil Pasic <pasic@xxxxxxxxxxxxx> wrote: > >>> > >>>On Mon, 22 Nov 2021 06:35:18 +0100 > >>>Halil Pasic <pasic@xxxxxxxxxxxxx> wrote: > >>> > >>>> > I think it should be a common issue, looking at > >>>> > vhost_vsock_handle_tx_kick(), it did: > >>>> > > >>>> > len += sizeof(pkt->hdr); > >>>> > vhost_add_used(vq, head, len); > >>>> > > >>>> > which looks like a violation of the spec since it's TX. > >>>> > >>>> I'm not sure the lines above look like a violation of the spec. If you > >>>> examine vhost_vsock_alloc_pkt() I believe that you will agree that: > >>>> len == pkt->len == pkt->hdr.len > >>>> which makes sense since according to the spec both tx and rx messages > >>>> are hdr+payload. And I believe hdr.len is the size of the payload, > >>>> although that does not seem to be properly documented by the spec. > >> > >>Sorry for being unclear, what I meant is that we probably should use > >>zero here. TX doesn't use in buffer actually. > >> > >>According to the spec, 0 should be the used length: > >> > >>"and len the total of bytes written into the buffer." > >> > >>>> > >>>> On the other hand tx messages are stated to be device read-only (in the > >>>> spec) so if the device writes stuff, that is certainly wrong. > >>>> > >> > >>Yes. > >> > >>>> If that is what happens. > >>>> > >>>> Looking at virtqueue_get_buf_ctx_split() I'm not sure that is what > >>>> happens. My hypothesis is that we just a last descriptor is an 'in' > >>>> type descriptor (i.e. a device writable one). For tx that assumption > >>>> would be wrong. > >>>> > >>>> I will have another look at this today and send a fix patch if my > >>>> suspicion is confirmed. > >>> > >>>If my suspicion is right something like: > >>> > >>>diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > >>>index 00f64f2f8b72..efb57898920b 100644 > >>>--- a/drivers/virtio/virtio_ring.c > >>>+++ b/drivers/virtio/virtio_ring.c > >>>@@ -764,6 +764,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq, > >>> struct vring_virtqueue *vq = to_vvq(_vq); > >>> void *ret; > >>> unsigned int i; > >>>+ bool has_in; > >>> u16 last_used; > >>> > >>> START_USE(vq); > >>>@@ -787,6 +788,9 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq, > >>> vq->split.vring.used->ring[last_used].id); > >>> *len = virtio32_to_cpu(_vq->vdev, > >>> vq->split.vring.used->ring[last_used].len); > >>>+ has_in = virtio16_to_cpu(_vq->vdev, > >>>+ vq->split.vring.used->ring[last_used].flags) > >>>+ & VRING_DESC_F_WRITE; > >> > >>Did you mean vring.desc actually? If yes, it's better not depend on > >>the descriptor ring which can be modified by the device. We've stored > >>the flags in desc_extra[]. > >> > >>> > >>> if (unlikely(i >= vq->split.vring.num)) { > >>> BAD_RING(vq, "id %u out of range\n", i); > >>>@@ -796,7 +800,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq, > >>> BAD_RING(vq, "id %u is not a head!\n", i); > >>> return NULL; > >>> } > >>>- if (vq->buflen && unlikely(*len > vq->buflen[i])) { > >>>+ if (has_in && q->buflen && unlikely(*len > vq->buflen[i])) { > >>> BAD_RING(vq, "used len %d is larger than in buflen %u\n", > >>> *len, vq->buflen[i]); > >>> return NULL; > >>> > >>>would fix the problem for split. I will try that out and let you know > >>>later. > >> > >>I'm not sure I get this, in virtqueue_add_split, the buflen[i] only > >>contains the in buffer length. > >> > >>I think the fixes are: > >> > >>1) fixing the vhost vsock > > > >Yep, in vhost_vsock_handle_tx_kick() we should have vhost_add_used(vq, > >head, 0) since the device doesn't write anything. > > > >>2) use suppress_used_validation=true to let vsock driver to validate > >>the in buffer length > >>3) probably a new feature so the driver can only enable the validation > >>when the feature is enabled. > > > >I fully agree with these steps. > > Michael sent a patch to suppress the validation, so I think we should > just fix vhost-vsock. I mean something like this: > > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c > index 938aefbc75ec..4e3b95af7ee4 100644 > --- a/drivers/vhost/vsock.c > +++ b/drivers/vhost/vsock.c > @@ -554,7 +554,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) > virtio_transport_free_pkt(pkt); > > len += sizeof(pkt->hdr); > - vhost_add_used(vq, head, len); > + vhost_add_used(vq, head, 0); > total_len += len; > added = true; > } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len))); > > I checked and the problem is there from the first commit, so we should > add: > > Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") > > I tested this patch and it works even without suppressing validation in > the virtio core. But for backwards compatibility we have to suppress it > for sure as Michael did. > > Maybe we can have a patch just with this change to backport it easily > and one after to clean up a bit the code that was added after (len, > total_len). > > @Halil Let me know if you want to do it, otherwise I can do it. > It is fine, it was you guys who figured out the solution so I think it should either be Jason or you who take credit for the patch. Thanks for addressing the issue this quickly! Regards, Halil _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization