On 12/06/2019 19:53, Jacob Pan wrote:
>>> You are right, the worst case of the spurious PS is to terminate the
>>> group prematurely. Need to know the scope of the HW damage in case
>>> of mdev where group IDs can be shared among mdevs belonging to the
>>> same PF.
>>
>> But from the IOMMU fault API point of view, the full page request is
>> identified by both PRGI and PASID. Given that each mdev has its own
>> set of PASIDs, it should be easy to isolate page responses per mdev.
>>
> On the Intel platform, devices sending a page request with private data
> must receive a page response with matching private data. If we solely
> depend on PRGI and PASID, we may send stale private data to the device
> in those incorrect page responses. Since private data may represent PF
> device-wide contexts, the consequence of sending a page response with
> wrong private data may affect other mdevs/PASIDs.
>
> One solution we are considering is to inject a sequence # (e.g. the
> ktime raw mono clock) as vIOMMU private data into the guest. The guest
> would return this fake private data in the page response, then the host
> would send the page response back to the device, matching PRGI, PASID
> and private_data.
>
> This solution does not expose HW-context-related private data to the
> guest but needs the page response in the iommu uapi to be extended.
>
> /**
>  * struct iommu_page_response - Generic page response information
>  * @version: API version of this structure
>  * @flags: encodes whether the corresponding fields are valid
>  *         (IOMMU_FAULT_PAGE_RESPONSE_* values)
>  * @pasid: Process Address Space ID
>  * @grpid: Page Request Group Index
>  * @code: response code from &enum iommu_page_response_code
>  * @private_data: private data for the matching page request
>  */
> struct iommu_page_response {
> #define IOMMU_PAGE_RESP_VERSION_1       1
>         __u32   version;
> #define IOMMU_PAGE_RESP_PASID_VALID     (1 << 0)
> #define IOMMU_PAGE_RESP_PRIVATE_DATA    (1 << 1)
>         __u32   flags;
>         __u32   pasid;
>         __u32   grpid;
>         __u32   code;
>         __u32   padding;
>         __u64   private_data[2];
> };
>
> There is also the change needed for separating storage for the real and
> fake private data.
>
> Sorry for the last-minute change, I did not realize the HW implications.
>
> I see this as a future extension due to limited testing,

I'm wondering how we deal with:

(1) old userspace that won't fill the new private_data field in
    page_response. A new kernel still has to support it.

(2) old kernel that won't recognize the new PRIVATE_DATA flag. Currently
    iommu_page_response() rejects page responses with unknown flags.

I guess we'll need a two-way negotiation, where userspace queries whether
the kernel supports the flag (2), and the kernel learns whether it should
expect the private data to come back (1).

> perhaps for
> now, can you add padding similar to page request? Make it 64B as well.

I don't think padding is necessary, because iommu_page_response is sent
by userspace to the kernel, unlike iommu_fault which is allocated by
userspace and filled by the kernel. Page response looks a lot more like
existing VFIO mechanisms, so I suppose we'll wrap the iommu_page_response
structure and include an argsz parameter at the top:

        struct vfio_iommu_page_response {
                u32 argsz;
                struct iommu_page_response pr;
        };

        struct vfio_iommu_page_response vpr = {
                .argsz = sizeof(vpr),
                .pr = ...
                ...
        };

        ioctl(devfd, VFIO_IOMMU_PAGE_RESPONSE, &vpr);

In that case supporting private data can be done by simply appending a
field at the end (plus the negotiation above).

Thanks,
Jean
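
P.S. To make the negotiation concrete, here is a rough userspace-side
sketch of filling the wrapped page response. It assumes the extended
iommu_page_response layout quoted above and the vfio_iommu_page_response
wrapper; send_page_response() and supported_flags are placeholders for
whatever probe mechanism we end up with, and VFIO_IOMMU_PAGE_RESPONSE is
only the ioctl name proposed here, so none of this is existing uapi:

        /*
         * Sketch only: needs <sys/ioctl.h> plus the uapi header carrying
         * the definitions quoted above. supported_flags is the set of
         * IOMMU_PAGE_RESP_* flags the kernel advertised during the
         * (hypothetical) negotiation step.
         */
        int send_page_response(int devfd, __u32 supported_flags,
                               __u32 pasid, __u32 grpid, __u32 code,
                               const __u64 priv[2])
        {
                struct vfio_iommu_page_response vpr = {
                        .argsz = sizeof(vpr),
                        .pr = {
                                .version = IOMMU_PAGE_RESP_VERSION_1,
                                .flags   = IOMMU_PAGE_RESP_PASID_VALID,
                                .pasid   = pasid,
                                .grpid   = grpid,
                                .code    = code,
                        },
                };

                /*
                 * Only pass private data back if the kernel said it
                 * understands the flag, otherwise an old kernel would
                 * reject the whole response.
                 */
                if (priv && (supported_flags & IOMMU_PAGE_RESP_PRIVATE_DATA)) {
                        vpr.pr.flags |= IOMMU_PAGE_RESP_PRIVATE_DATA;
                        vpr.pr.private_data[0] = priv[0];
                        vpr.pr.private_data[1] = priv[1];
                }

                return ioctl(devfd, VFIO_IOMMU_PAGE_RESPONSE, &vpr);
        }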