On Tue, Oct 10, 2017 at 10:39:27AM -0700, Dan Williams wrote: > On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe > >> Have a look at the patch [1], I don't touch the ODP path. > > > > But, does ODP work OK already? I'm not clear on that.. > > It had better. If the mapping is invalidated I would hope that > generates an io fault that gets handled by the driver to setup the new > mapping. I don't see how it can work otherwise. I would assume so too... > > This is why ODP should be the focus because this cannot work fully > > reliably otherwise.. > > The lease break time is configurable. If that application can't > respond to a stop request within a timeout of its own choosing then it > should not be using DAX mappings. Well, no RDMA application can really do this, unless you set the timeout to multiple minutes, on par with network timeouts. Again, these details are why I think this kind of DAX and non ODP-MRs are probably practically not too useful for a production system. Great for test of course, but in that case SIGKILL would be fine too... > > Well, what about using SIGKILL if the lease-break-time hits? The > > kernel will clean up the MRs when the process exits and this will > > fence DMA to that memory. > > Can you point me to where the MR cleanup code fences DMA and quiesces > the device? Yes. The MR's are associated with an fd. When the fd is closed ib_uverbs_close triggers ib_uverbs_cleanup_ucontext which runs through all the objects, including MRs, and deletes them. The specification for deleting a MR requires a synchronous fence with the hardware. After MR deletion the hardware will not DMA to any pages described by the old MR, and those pages will be unpinned. > > But, still, if you really want to be fined graned, then I think > > invalidating the impacted MR's is a better solution for RDMA than > > trying to do it with the IOMMU... > > If there's a better routine for handling ib_umem_lease_break() I'd > love to use it. Right now I'm reaching for the only tool I know for > kernel enforced revocation of DMA access. Well, you'd have to code something in the MR code to keep track of DAX MRs and issue an out of band invalidate to impacted MRs to create the fence. This probably needs some driver work, I'm not sure if all the hardware can do out of band invalidate to any MR or not.. Generally speaking, in RDMA, when a new feature like this comes along we have to push a lot of the work down to the driver authors, and the approach has historically been that new features only work on some hardware (as much as I dislike this, it is pragmatic) So, not being able to support DAX on certain RDMA hardware is not an unreasonable situation in our space. Jason