On 16/11/2018 18:46, Jean-Philippe Brucker wrote: >>> +/* >>> + * __viommu_sync_req - Complete all in-flight requests >>> + * >>> + * Wait for all added requests to complete. When this function returns, all >>> + * requests that were in-flight at the time of the call have completed. >>> + */ >>> +static int __viommu_sync_req(struct viommu_dev *viommu) >>> +{ >>> + int ret = 0; >>> + unsigned int len; >>> + size_t write_len; >>> + struct viommu_request *req; >>> + struct virtqueue *vq = viommu->vqs[VIOMMU_REQUEST_VQ]; >>> + >>> + assert_spin_locked(&viommu->request_lock); >>> + >>> + virtqueue_kick(vq); >>> + >>> + while (!list_empty(&viommu->requests)) { >>> + len = 0; >>> + req = virtqueue_get_buf(vq, &len); >>> + if (!req) >>> + continue; >>> + >>> + if (!len) >>> + viommu_set_req_status(req->buf, req->len, >>> + VIRTIO_IOMMU_S_IOERR); >>> + >>> + write_len = req->len - req->write_offset; >>> + if (req->writeback && len >= write_len) >> I don't get "len >= write_len". Is it valid for the device to write more >> than write_len? If it writes less than write_len, the status is not set, >> is it? Actually, len could well be three bytes smaller than write_len. The spec doesn't require the device to write the three padding bytes in virtio_iommu_req_tail, after the status byte. Here the driver just assumes that the device wrote the reserved field. The QEMU device seems to write uninitialized data in there... Any objection to changing the spec to require the device to initialize those bytes to zero? I think it makes things nicer overall and shouldn't have any performance impact. [...] >>> +static struct iommu_domain *viommu_domain_alloc(unsigned type) >>> +{ >>> + struct viommu_domain *vdomain; >>> + >>> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) >> smmu drivers also support the IDENTITY type. Don't we want to support it >> as well? > > Eventually yes. The IDENTITY domain is useful when an IOMMU has been > forced upon you and gets in the way of performance, in which case you > get rid of it with CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y or > iommu.passthrough=y. For virtio-iommu though, you could simply not > instantiate the device. > > I don't think it's difficult to add: if the device supports > VIRTIO_IOMMU_F_BYPASS, then we simply don't send an ATTACH request. > Otherwise after ATTACH we send a MAP for the whole input range. If the > change isn't bigger than this, I'll add it to v5. Not as straightforward as I hoped, when the device doesn't support VIRTIO_IOMMU_F_BYPASS: * We can't simply create an ID map for the whole input range, we need to carve out the resv_mem regions. * For a VFIO device, the host has no way of forwarding the identity mapping. For example the guest will attempt to map [0; 0xffffffffffff] -> [0; 0xffffffffffff], but VFIO only allows to map RAM and cannot take such request. One solution is to only create ID mapping for RAM, and register a notifier for memory hotplug, like intel-iommu does for IOMMUs that don't support HW pass-through. Since it's not completely trivial and - I think - not urgent, I'll leave this for later. Thanks, Jean