On Mon, Jan 14, 2019 at 07:12:08PM +0000, Robin Murphy wrote:
> On 14/01/2019 18:20, Michael S. Tsirkin wrote:
> > On Mon, Jan 14, 2019 at 08:41:37PM +0800, Jason Wang wrote:
> > >
> > > On 2019/1/14 5:50 PM, Christoph Hellwig wrote:
> > > > On Mon, Jan 14, 2019 at 05:41:56PM +0800, Jason Wang wrote:
> > > > > On 2019/1/11 5:15 PM, Joerg Roedel wrote:
> > > > > > On Fri, Jan 11, 2019 at 11:29:31AM +0800, Jason Wang wrote:
> > > > > > > Just wonder if my understanding is correct: IOMMU_PLATFORM must be set for
> > > > > > > all virtio devices under AMD-SEV guests?
> > > > > > Yes, that is correct. Emulated DMA can only happen on the SWIOTLB
> > > > > > aperture, because that memory is not encrypted. The guest bounces the
> > > > > > data then to its encrypted memory.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > 	Joerg
> > > > >
> > > > > Thanks, have you tested vhost-net in this case? I suspect it may not work.
> > > > Which brings me back to my pet peeve that we need to take action so
> > > > that virtio uses the proper dma mapping API by default, with quirks
> > > > for legacy cases. The magic bypass it uses is just causing problems
> > > > over problems.
> > >
> > > Yes, I fully agree with you. This is probably an exact example of such
> > > a problem.
> > >
> > > Thanks
> >
> > I don't think so - the issue is really that the DMA API does not yet handle
> > the SEV case 100% correctly. I suspect passthrough devices would have
> > the same issue.
>
> Huh? Regardless of which virtio devices use it or not, the DMA API is
> handling the SEV case as correctly as it possibly can, by forcing everything
> through the unencrypted bounce buffer. If the segments being mapped are too
> big for that bounce buffer in the first place, there's nothing it can
> possibly do except fail, gracefully or otherwise.

It seems reasonable to be able to ask it what its capabilities are, though.

> Now, in theory, yes, the real issue at hand is not unique to virtio-blk nor
> SEV - any driver whose device has a sufficiently large DMA segment size and
> who manages to get sufficient physically-contiguous memory could technically
> generate a scatterlist segment longer than SWIOTLB can handle. However, in
> practice that basically never happens, not least because very few drivers
> ever override the default 64K DMA segment limit. AFAICS nothing in
> drivers/virtio is calling dma_set_max_seg_size() or otherwise assigning any
> dma_parms to replace the defaults either, so the really interesting question
> here is how are these apparently-out-of-spec 256K segments getting generated
> at all?
>
> Robin.

I guess this is what you are looking for:

	/* Host can optionally specify maximum segment size and number of
	 * segments. */
	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
				   struct virtio_blk_config, size_max, &v);
	if (!err)
		blk_queue_max_segment_size(q, v);
	else
		blk_queue_max_segment_size(q, -1U);
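As a rough illustration of "asking the DMA API what its capabilities are",
the sketch below shows how the driver could clamp the segment size it
advertises to the block layer instead of unconditionally falling back to
-1U. It assumes a helper, called dma_max_mapping_size() here, that reports
the largest single mapping the platform DMA code can bounce; that helper,
the virtblk_cap_seg_size() wrapper and the choice of vdev->dev.parent as
the DMA device are illustrative assumptions for this sketch, not existing
code. For reference, SWIOTLB can bounce at most IO_TLB_SEGSIZE (128) slots
of 2 KiB each, i.e. 256 KiB, in a single mapping.

	/*
	 * Illustrative sketch only, not existing virtio_blk code.
	 * dma_max_mapping_size() is assumed to return the largest single
	 * mapping the DMA layer (e.g. the SEV SWIOTLB bounce buffer) can
	 * handle, or SIZE_MAX if there is no such limit.
	 */
	static void virtblk_cap_seg_size(struct virtio_device *vdev,
					 struct request_queue *q,
					 u32 host_max)
	{
		/* the transport device (e.g. the PCI function) does the DMA */
		struct device *dma_dev = vdev->dev.parent;
		size_t dma_max = dma_max_mapping_size(dma_dev);
		u32 max_size = host_max ?: -1U;

		/* never advertise a segment larger than the DMA layer can map */
		if (dma_max < max_size)
			max_size = dma_max;

		blk_queue_max_segment_size(q, max_size);
	}

The SCSI midlayer's use of dma_get_max_seg_size() in the listing below is
in the same spirit, but dma_parms describe the device's own limit rather
than what the bounce buffer can actually handle.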
virtio isn't the only caller with a value >64K:

$ git grep -A1 blk_queue_max_segment_size
Documentation/block/biodoc.txt:	blk_queue_max_segment_size(q, max_seg_size)
Documentation/block/biodoc.txt-		Maximum size of a clustered segment, 64kB default.
--
block/blk-settings.c: * blk_queue_max_segment_size - set max segment size for blk_rq_map_sg
block/blk-settings.c- * @q:  the request queue for the device
--
block/blk-settings.c:void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
block/blk-settings.c-{
--
block/blk-settings.c:EXPORT_SYMBOL(blk_queue_max_segment_size);
block/blk-settings.c-
--
drivers/block/mtip32xx/mtip32xx.c:	blk_queue_max_segment_size(dd->queue, 0x400000);
drivers/block/mtip32xx/mtip32xx.c-	blk_queue_io_min(dd->queue, 4096);
--
drivers/block/nbd.c:	blk_queue_max_segment_size(disk->queue, UINT_MAX);
drivers/block/nbd.c-	blk_queue_max_segments(disk->queue, USHRT_MAX);
--
drivers/block/ps3disk.c:	blk_queue_max_segment_size(queue, dev->bounce_size);
drivers/block/ps3disk.c-
--
drivers/block/ps3vram.c:	blk_queue_max_segment_size(queue, BLK_MAX_SEGMENT_SIZE);
drivers/block/ps3vram.c-	blk_queue_max_hw_sectors(queue, BLK_SAFE_MAX_SECTORS);
--
drivers/block/rbd.c:	blk_queue_max_segment_size(q, UINT_MAX);
drivers/block/rbd.c-	blk_queue_io_min(q, objset_bytes);
--
drivers/block/sunvdc.c:	blk_queue_max_segment_size(q, PAGE_SIZE);
drivers/block/sunvdc.c-
--
drivers/block/virtio_blk.c:		blk_queue_max_segment_size(q, v);
drivers/block/virtio_blk.c-	else
drivers/block/virtio_blk.c:		blk_queue_max_segment_size(q, -1U);
drivers/block/virtio_blk.c-
--
drivers/block/xen-blkfront.c:	blk_queue_max_segment_size(rq, PAGE_SIZE);
drivers/block/xen-blkfront.c-
--
drivers/cdrom/gdrom.c:	blk_queue_max_segment_size(gd.gdrom_rq, 0x40000);
drivers/cdrom/gdrom.c-	gd.disk->queue = gd.gdrom_rq;
--
drivers/memstick/core/ms_block.c:	blk_queue_max_segment_size(msb->queue,
drivers/memstick/core/ms_block.c-				   MS_BLOCK_MAX_PAGES * msb->page_size);
--
drivers/memstick/core/mspro_block.c:	blk_queue_max_segment_size(msb->queue,
drivers/memstick/core/mspro_block.c-				   MSPRO_BLOCK_MAX_PAGES * msb->page_size);
--
drivers/mmc/core/queue.c:	blk_queue_max_segment_size(mq->queue, host->max_seg_size);
drivers/mmc/core/queue.c-
--
drivers/s390/block/dasd.c:	blk_queue_max_segment_size(q, PAGE_SIZE);
drivers/s390/block/dasd.c-	blk_queue_segment_boundary(q, PAGE_SIZE - 1);
--
drivers/scsi/be2iscsi/be_main.c:	blk_queue_max_segment_size(sdev->request_queue, 65536);
drivers/scsi/be2iscsi/be_main.c-	return 0;
--
drivers/scsi/scsi_debug.c:	blk_queue_max_segment_size(sdp->request_queue, -1U);
drivers/scsi/scsi_debug.c-	if (sdebug_no_uld)
--
drivers/scsi/scsi_lib.c:	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
drivers/scsi/scsi_lib.c-
--
drivers/scsi/ufs/ufshcd.c:	blk_queue_max_segment_size(q, PRDT_DATA_BYTE_COUNT_MAX);
drivers/scsi/ufs/ufshcd.c-
--
include/linux/blkdev.h:extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
include/linux/blkdev.h-extern void blk_queue_max_discard_sectors(struct request_queue *q,
--
include/linux/mmc/host.h:	unsigned int		max_seg_size;	/* see blk_queue_max_segment_size */
include/linux/mmc/host.h-	unsigned short		max_segs;	/* see blk_queue_max_segments */

Some of these devices are probably not going to work well if passed
through to a SEV guest.

Going back to virtio, at some level virtio is like a stacking device
so it does not necessarily need a limit.

-- 
MST