Hi,

On 3/24/22 1:33 AM, Jason Gunthorpe wrote:
> On Wed, Mar 23, 2022 at 04:51:25PM -0600, Alex Williamson wrote:
>
>> My overall question here would be whether we can actually achieve a
>> compatibility interface that has sufficient feature transparency that we
>> can dump vfio code in favor of this interface, or will there be enough
>> niche use cases that we need to keep type1 and vfio containers around
>> through a deprecation process?
> Other than SPAPR, I think we can.
>
>> The locked memory differences for one seem like something that
>> libvirt wouldn't want hidden
> I'm first interested to have an understanding how this change becomes
> a real problem in practice that requires libvirt to do something
> different for vfio or iommufd. We can discuss in the other thread
>
> If this is the make or break point then I think we can deal with it
> either by going back to what vfio does now or perhaps some other
> friendly compat approach..
>
>> and we have questions regarding support for vaddr hijacking
> I'm not sure what vaddr hijacking is? Do you mean
> VFIO_DMA_MAP_FLAG_VADDR ? There is a comment that outlines my plan to
> implement it in a functionally compatible way without the deadlock
> problem. I estimate this as a small project.
>
>> and different ideas how to implement dirty page tracking,
> I don't think this is compatibility. No kernel today triggers qemu to
> use this feature as no kernel supports live migration. No existing
> qemu will trigger this feature with new kernels that support live
> migration v2. Therefore we can adjust qemu's dirty tracking at the
> same time we enable migration v2 in qemu.
>
> With Joao's work we are close to having a solid RFC to come with
> something that can be fully implemented.
>
> Hopefully we can agree to this soon enough that qemu can come with a
> full package of migration v2 support including the dirty tracking
> solution.
>
>> not to mention the missing features that are currently well used,
>> like p2p mappings, coherency tracking, mdev, etc.
> I consider these all mandatory things, they won't be left out.
>
> The reason they are not in the RFC is mostly because supporting them
> requires work outside just this iommufd area, and I'd like this series
> to remain self-contained.
>
> I've already got a draft to add DMABUF support to VFIO PCI which
> nicely solves the follow_pfn security problem, we want to do this for
> another reason already. I'm waiting for some testing feedback before
> posting it. Need some help from Daniel make the DMABUF revoke semantic
> him and I have been talking about. In the worst case can copy the
> follow_pfn approach.
>
> Intel no-snoop is simple enough, just needs some Intel cleanup parts.
>
> mdev will come along with the final VFIO integration, all the really
> hard parts are done already. The VFIO integration is a medium sized
> task overall.
>
> So, I'm not ready to give up yet :)
>
>> Where do we focus attention? Is symlinking device files our proposal
>> to userspace and is that something achievable, or do we want to use
>> this compatibility interface as a means to test the interface and
>> allow userspace to make use of it for transition, if their use cases
>> allow it, perhaps eventually performing the symlink after deprecation
>> and eventual removal of the vfio container and type1 code? Thanks,
> symlinking device files is definitely just a suggested way to expedite
> testing.
>
> Things like qemu that are learning to use iommufd-only features should
> learn to directly open iommufd instead of vfio container to activate
> those features.
>
> Looking long down the road I don't think we want to have type 1 and
> iommufd code forever. So, I would like to make an option to compile
> out vfio container support entirely and have that option arrange for
> iommufd to provide the container device node itself.

I am currently working on migrating the QEMU VFIO device onto the new
API because, following our discussions, the compat mode cannot be used
to implement nesting anyway. I hope to be able to present something
next week.

Thanks

Eric
>
> I think we can get there pretty quickly, or at least I haven't got
> anything that is scaring me alot (beyond SPAPR of course)
>
> For the dpdk/etcs of the world I think we are already there.
>
> Jason
>