On Thu, May 06, 2021 at 11:20:30AM +0800, Jason Wang wrote:
>
> On 2021/4/23 4:09 PM, Jason Wang wrote:
> > Hi:
> >
> > Sometimes, the driver doesn't trust the device. This usually
> > happens for an encrypted VM or VDUSE[1]. In both cases, technology
> > like swiotlb is used to prevent the poking/mangling of memory from the
> > device. But this is not sufficient, since the current virtio driver may
> > trust what is stored in the descriptor table (coherent mapping) for
> > performing DMA operations like unmap and bounce, so the device may
> > choose to utilize the behaviour of swiotlb to perform attacks[2].
> >
> > To protect from a malicious device, this series stores and uses the
> > descriptor metadata in an auxiliary structure which cannot be accessed
> > via swiotlb, instead of the ones in the descriptor table. This means
> > the descriptor table is write-only from the view of the driver.
> >
> > Actually, we've almost achieved that through packed virtqueue and we
> > just need to fix a corner case of handling mapping errors. For split
> > virtqueue we just follow what's done in the packed.
> >
> > Note that we don't duplicate descriptor metadata for indirect
> > descriptors since they use a stream mapping which is read only, so it's
> > safe if the metadata of non-indirect descriptors is correct.
> >
> > For split virtqueue, the change increases the footprint due to the
> > auxiliary metadata, but it's almost negligible in simple tests
> > like pktgen or netperf.
> >
> > Slightly tested with packed on/off, iommu on/off, swiotlb force/off in
> > the guest.
> >
> > Please review.
> >
> > Changes from V1:
> > - Always use auxiliary metadata for split virtqueue
> > - Don't read from descriptor when detaching indirect descriptor
>
>
> Hi Michael:
>
> Our QE sees no regression on the perf test for 10G but some regressions
> (5%-10%) on 40G card.
>
> I think this is expected since we increase the footprint, are you OK with
> this and we can try to optimize on top or you have other ideas?
>
> Thanks

Let's try for just a bit, won't make this window anyway:

I have an old idea. Add a way to find out that unmap is a nop
(or more exactly does not use the address/length).
Then in that case even with DMA API we do not need
the extra data. Hmm?

>
> >
> > [1]
> > https://lore.kernel.org/netdev/fab615ce-5e13-a3b3-3715-a4203b4ab010@xxxxxxxxxx/T/
> > [2]
> > https://yhbt.net/lore/all/c3629a27-3590-1d9f-211b-c0b7be152b32@xxxxxxxxxx/T/#mc6b6e2343cbeffca68ca7a97e0f473aaa871c95b
> >
> > Jason Wang (7):
> >   virtio-ring: maintain next in extra state for packed virtqueue
> >   virtio_ring: rename vring_desc_extra_packed
> >   virtio-ring: factor out desc_extra allocation
> >   virtio_ring: secure handling of mapping errors
> >   virtio_ring: introduce virtqueue_desc_add_split()
> >   virtio: use err label in __vring_new_virtqueue()
> >   virtio-ring: store DMA metadata in desc_extra for split virtqueue
> >
> >  drivers/virtio/virtio_ring.c | 201 +++++++++++++++++++++++++----------
> >  1 file changed, 144 insertions(+), 57 deletions(-)
> >
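
To make the two ideas in this thread concrete, here is a rough userspace
sketch, not the actual virtio_ring.c code. It models a device-visible
descriptor table next to a driver-private copy of the DMA metadata, and
takes the unmap arguments only from the private copy. All names
(toy_ring, shared_desc, toy_add, toy_detach, unmap_record) and the
unmap_is_nop flag are made up for illustration; only desc_extra echoes a
name from the series, and its layout here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4

/* Descriptor as the device sees it; the device may overwrite it. */
struct shared_desc {
	uint64_t addr;
	uint32_t len;
};

/* Driver-private copy of the DMA metadata (assumed layout). */
struct desc_extra {
	uint64_t addr;
	uint32_t len;
};

struct toy_ring {
	struct shared_desc desc[RING_SIZE];	/* device-visible table */
	struct desc_extra extra[RING_SIZE];	/* never exposed to the device */
	bool unmap_is_nop;			/* hypothetical: unmap ignores addr/len */
};

/* What a detach asked the (imaginary) DMA layer to unmap. */
struct unmap_record {
	uint64_t addr;
	uint32_t len;
	bool called;
};

/* Publish a buffer; keep a private copy only if unmap will need it. */
static void toy_add(struct toy_ring *r, unsigned int i,
		    uint64_t addr, uint32_t len)
{
	assert(i < RING_SIZE);
	r->desc[i].addr = addr;
	r->desc[i].len = len;
	if (!r->unmap_is_nop) {
		r->extra[i].addr = addr;
		r->extra[i].len = len;
	}
}

/* Detach a buffer; the shared table is treated as write-only here. */
static struct unmap_record toy_detach(const struct toy_ring *r,
				      unsigned int i)
{
	struct unmap_record rec = { 0, 0, false };

	assert(i < RING_SIZE);
	if (r->unmap_is_nop)
		return rec;	/* nothing to unmap, no metadata needed */

	rec.addr = r->extra[i].addr;
	rec.len = r->extra[i].len;
	rec.called = true;
	return rec;
}
```

If a device rewrites desc[0] after toy_add(), toy_detach() still reports
the address and length the driver originally mapped. With unmap_is_nop
set, the private copy is never written or read, which is the footprint
saving suggested above.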