Hi Jason,

On Thu, 17 Sep 2020 11:53:49 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:

> On 2020/9/17 7:09 AM, Jacob Pan (Jun) wrote:
> > Hi Jason,
> > On Wed, 16 Sep 2020 15:38:41 -0300, Jason Gunthorpe <jgg@xxxxxxxxxx>
> > wrote:
> >
> >> On Wed, Sep 16, 2020 at 11:21:10AM -0700, Jacob Pan (Jun) wrote:
> >>> Hi Jason,
> >>> On Wed, 16 Sep 2020 14:01:13 -0300, Jason Gunthorpe
> >>> <jgg@xxxxxxxxxx> wrote:
> >>>
> >>>> On Wed, Sep 16, 2020 at 09:33:43AM -0700, Raj, Ashok wrote:
> >>>>> On Wed, Sep 16, 2020 at 12:07:54PM -0300, Jason Gunthorpe
> >>>>> wrote:
> >>>>>> On Tue, Sep 15, 2020 at 05:22:26PM -0700, Jacob Pan (Jun)
> >>>>>> wrote:
> >>>>>>>> If user space wants to bind page tables, create the PASID
> >>>>>>>> with /dev/sva, use ioctls there to setup the page table
> >>>>>>>> the way it wants, then pass the now configured PASID to a
> >>>>>>>> driver that can use it.
> >>>>>>> Are we talking about bare metal SVA?
> >>>>>> What a weird term.
> >>>>> Glad you noticed it at v7 :-)
> >>>>>
> >>>>> Any suggestions on something less weird than
> >>>>> Shared Virtual Addressing? There is a reason why we moved from
> >>>>> SVM to SVA.
> >>>> SVA is fine, what is "bare metal" supposed to mean?
> >>>>
> >>> What I meant here is sharing virtual address between DMA and host
> >>> process. This requires devices perform DMA request with PASID and
> >>> use IOMMU first level/stage 1 page tables.
> >>> This can be further divided into 1) user SVA 2) supervisor SVA
> >>> (sharing init_mm)
> >>>
> >>> My point is that /dev/sva is not useful here since the driver can
> >>> perform PASID allocation while doing SVA bind.
> >> No, you are thinking too small.
> >>
> >> Look at VDPA, it has a SVA uAPI. Some HW might use PASID for the
> >> SVA.
> > Could you point to me the SVA UAPI? I couldn't find it in the
> > mainline. Seems VDPA uses VHOST interface?
>
>
> It's the vhost_iotlb_msg defined in uapi/linux/vhost_types.h.
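For readers without the header at hand, here is a rough Python mirror of the message being pointed to. The field names and constant values are written from memory of uapi/linux/vhost_types.h and should be checked against the kernel tree in use; the class name `VhostIotlbMsg` and the helpers `make_update` and `make_invalidate` are hypothetical, made up only for this illustration. The sketch shows that the message describes an IOVA range with a user address and a permission, and that none of the fields listed here carries a PASID.

```python
import ctypes

# Permission values (assumed to match the header)
VHOST_ACCESS_RO = 0x1
VHOST_ACCESS_WO = 0x2
VHOST_ACCESS_RW = 0x3

# Message types (assumed to match the header)
VHOST_IOTLB_MISS = 1
VHOST_IOTLB_UPDATE = 2
VHOST_IOTLB_INVALIDATE = 3
VHOST_IOTLB_ACCESS_FAIL = 4


class VhostIotlbMsg(ctypes.Structure):
    """Illustrative mirror of struct vhost_iotlb_msg; not the kernel definition."""
    _fields_ = [
        ("iova", ctypes.c_uint64),   # I/O virtual address of the range
        ("size", ctypes.c_uint64),   # length of the range in bytes
        ("uaddr", ctypes.c_uint64),  # userspace address backing the range
        ("perm", ctypes.c_uint8),    # VHOST_ACCESS_*
        ("type", ctypes.c_uint8),    # VHOST_IOTLB_*
    ]


def make_update(iova, size, uaddr, perm=VHOST_ACCESS_RW):
    """Hypothetical helper: build a message that installs a mapping."""
    return VhostIotlbMsg(iova, size, uaddr, perm, VHOST_IOTLB_UPDATE)


def make_invalidate(iova, size):
    """Hypothetical helper: build a message that drops a mapping."""
    return VhostIotlbMsg(iova, size, 0, 0, VHOST_IOTLB_INVALIDATE)
```

In this model, map and invalidate operations are expressed per IOVA range only, so anything keyed on a PASID would need to be carried some other way.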
Thanks for the pointer. For complete vSVA functionality we would need:
1 TLB flush (IOTLB and PASID cache etc.)
2 PASID alloc/free
3 bind/unbind page tables or PASID tables
4 Page request service

Seems vhost_iotlb_msg can be used for #1 partially. And the proposal is to
pluck out the rest into /dev/sva? Seems awkward, as Alex pointed out earlier
for a similar situation in VFIO.

> >
> >> When VDPA is used by DPDK it makes sense that the PASID will be SVA
> >> and 1:1 with the mm_struct.
> >>
> > I still don't see why bare metal DPDK needs to get a handle of the
> > PASID.
>
>
> My understanding is that it may:
>
> - have a unified uAPI with vSVA: alloc, bind, unbind, free

Got your point, but vSVA needs more than these.

> - leave the binding policy to userspace instead of using an
> implied one in the kernel
>

Only if necessary.

> > Perhaps the SVA patch would explain. Or are you talking about
> > vDPA DPDK process that is used to support virtio-net-pmd in the
> > guest?
> >> When VDPA is used by qemu it makes sense that the PASID will be an
> >> arbitrary IOVA map constructed to be 1:1 with the guest vCPU
> >> physical map. /dev/sva allows a single uAPI to do this kind of
> >> setup, and qemu can support it while supporting a range of SVA
> >> kernel drivers. VDPA and vfio-mdev are obvious initial targets.
> >>
> >> *BOTH* are needed.
> >>
> >> In general any uAPI for PASID should have the option to use either
> >> the mm_struct SVA PASID *OR* a PASID from /dev/sva. It costs
> >> virtually nothing to implement this in the driver as PASID is just
> >> a number, and gives so much more flexibility.
> >>
> > Not really nothing in terms of PASID life cycles. For example, if
> > user uses uacce interface to open an accelerator, it gets an
> > FD_acc. Then it opens /dev/sva to allocate PASID then get another
> > FD_pasid. Then we pass FD_pasid to the driver to bind page tables,
> > perhaps multiple drivers.
> > Now we have to worry about if FD_pasid
> > gets closed before FD_acc(s) closed and all these race conditions.
>
>
> I'm not sure I understand this. But this demonstrates the flexibility
> of a unified uAPI. E.g. it allows vDPA and VFIO device to use the
> same PASID which can be shared with a process in the guest.
>

This is for user DMA, not for vSVA. I was contending that /dev/sva creates
unnecessary steps for such usage.

For vSVA, I think vDPA and VFIO can potentially share, but I am not seeing
convincing benefits.

If a guest process wants to do SVA with a VFIO assigned device and a
vDPA-backed virtio-net at the same time, it might be a limitation if PASID
is not managed via a common interface. But I am not sure what vDPA SVA
support will look like. Does it support gIOVA? Does it need virtio IOMMU?

> For the race condition, it could probably be solved with refcnt.
>

Agreed, but the best solution might be not to have the problem in the first
place :)

> Thanks
>
>
> >
> > If we do not expose FD_pasid to the user, the teardown is much
> > simpler and streamlined. Following each FD_acc close, PASID unbind
> > is performed.
> >>> Yi can correct me but this set is about VFIO-PCI, VFIO-mdev
> >>> will be introduced later.
> >> Last patch is:
> >>
> >> vfio/type1: Add vSVA support for IOMMU-backed mdevs
> >>
> >> So pretty hard to see how this is not about vfio-mdev, at least a
> >> little..
> >>
> >> Jason
> >
> > Thanks,
> >
> > Jacob
>


Thanks,

Jacob
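
To make the lifetime question above concrete, here is a toy model of the refcnt idea, assuming FD_pasid and each bound FD_acc hold one reference apiece. All names (`Pasid`, `AccelFd`, `close_pasid_fd`) are hypothetical and only stand in for whatever the kernel objects would be; this is a sketch of the ordering problem, not of any actual driver code.

```python
class Pasid:
    """Toy PASID object; freed only when the last reference is dropped."""

    def __init__(self, value):
        self.value = value
        self.refs = 1           # reference owned by FD_pasid
        self.freed = False

    def get(self):
        if self.freed:
            raise RuntimeError("PASID already freed")
        self.refs += 1

    def put(self):
        if self.freed:
            raise RuntimeError("PASID already freed")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True   # stands in for the real PASID free


class AccelFd:
    """Toy FD_acc: bind takes a reference, close unbinds and drops it."""

    def __init__(self):
        self.pasid = None

    def bind(self, pasid):
        pasid.get()
        self.pasid = pasid

    def close(self):
        if self.pasid is not None:
            self.pasid.put()    # unbind happens on FD_acc close
            self.pasid = None


def close_pasid_fd(pasid):
    """Toy FD_pasid close: drops only the reference the FD itself owned."""
    pasid.put()
```

With this arrangement, closing FD_pasid first does not free the PASID while any FD_acc is still bound; the free happens on the last close, whichever FD that is. The cost is the extra object and the extra ordering rules, which is the overhead being argued about in the thread.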