> From: Jason Wang <jasowang@xxxxxxxxxx>
> Sent: Thursday, August 13, 2020 12:34 PM
>
> On 2020/8/12 12:05 PM, Tian, Kevin wrote:
> >> The problem is that if we tie all controls via VFIO uAPI, the other
> >> subsystem like vDPA is likely to duplicate them. I wonder if there is a
> >> way to decouple the vSVA out of VFIO uAPI?
> > vSVA is a per-device (either pdev or mdev) feature thus naturally should
> > be managed by its device driver (VFIO or vDPA). From this angle some
> > duplication is inevitable given VFIO and vDPA are orthogonal passthrough
> > frameworks. Within the kernel the majority of vSVA handling is done by
> > the IOMMU and IOASID modules, thus most of the logic is shared.
>
>
> So why not introduce vSVA uAPI at the IOMMU or IOASID layer?

One may ask a similar question: why doesn't the IOMMU expose map/unmap as uAPI...

>
> >>> If a userspace DMA interface can be easily
> >>> adapted to be a passthrough one, it might be the choice.
> >> It's not that easy even for VFIO, which requires a lot of new uAPIs and
> >> infrastructures (e.g. mdev) to be invented.
> >>
> >>
> >>> But for idxd,
> >>> we see mdev a much better fit here, given the big difference between
> >>> what userspace DMA requires and what the guest driver requires in this hw.
> >> A weak point for mdev is that it can't serve kernel subsystems other than
> >> VFIO. In this case, you need some other infrastructure (like [1]) to do
> >> this.
> > mdev is not exclusive from kernel usages. It's perfectly fine for a driver
> > to reserve some work queues for host usages, while wrapping others
> > into mdevs.
>
>
> I meant you may want slices to be independent devices from the kernel
> point of view:
>
> E.g. for ethernet devices, you may want 10K mdevs to be passed to guests.
>
> Similarly, you may want 10K net devices which are connected to the kernel
> networking subsystems.
>
> In this case it's not simply reserving queues, but you need some other
> type of device abstraction.
> There could be some kind of duplication
> between this and mdev.

Yes, some abstraction is required, but isn't that what the driver should
care about instead of the mdev framework itself? If the driver reports the
same set of resources to both mdev and networking, it needs to make sure
that when a resource is claimed through one interface, it is marked in-use
in the other. E.g. each mdev type includes an available_instances
attribute. The driver could report 10K available instances initially and
then update it to 5K when the other 5K queues are used for net devices
later.

Mdev definitely has its usage limitations. Some may be improved in the
future, some may not. But those are distracting from the original purpose
of this thread (mdev vs. userspace DMA) and are better discussed in other
places, e.g. LPC...

Thanks
Kevin