On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Oct 09, 2017 at 12:28:29PM -0700, Dan Williams wrote:
>
>> > I don't think this has ever come up in the context of an all-device MR
>> > invalidate requirement. Drivers already have code to invalidate
>> > specific MRs, but to find all MRs that touch certain pages and then
>> > invalidate them would be new code.
>> >
>> > We also have ODP aware drivers that can retarget a MR to new
>> > physical pages. If the block map changes DAX should synchronously
>> > retarget the ODP MR, not halt DMA.
>>
>> Have a look at the patch [1], I don't touch the ODP path.
>
> But, does ODP work OK already? I'm not clear on that..

It had better. If the mapping is invalidated I would hope that
generates an I/O fault that gets handled by the driver to set up the
new mapping. I don't see how it can work otherwise.

>> > Most likely ODP & DAX would need to be used together to get robust
>> > user applications, as having the user QP's go to an error state at
>> > random times (due to DMA failures) during operation is never going to
>> > be acceptable...
>>
>> It's not random. The process that set up the mapping and registered
>> the memory gets SIGIO when someone else tries to modify the file map.
>> That process then gets /proc/sys/fs/lease-break-time seconds to fix
>> the problem before the kernel force revokes the DMA access.
>
> Well, the process can't fix the problem in bounded time, so it is
> random if it will fail or not.
>
> MR life time is under the control of the remote side, and time to
> complete the network exchanges required to release the MRs is hard to
> bound. So even if I implement SIGIO properly my app will still likely
> have random QP failures under various cases and work loads. :(
>
> This is why ODP should be the focus because this cannot work fully
> reliably otherwise..

The lease break time is configurable.
If that application can't respond to a stop request within a timeout of
its own choosing then it should not be using DAX mappings.

>
>> > Perhaps you might want to initially only support ODP MR mappings with
>> > DAX and then the DMA fencing issue goes away?
>>
>> I'd rather try to fix the non-ODP DAX case instead of just turning it off.
>
> Well, what about using SIGKILL if the lease-break-time hits? The
> kernel will clean up the MRs when the process exits and this will
> fence DMA to that memory.

Can you point me to where the MR cleanup code fences DMA and quiesces
the device?

> But, still, if you really want to be fine grained, then I think
> invalidating the impacted MRs is a better solution for RDMA than
> trying to do it with the IOMMU...

If there's a better routine for handling ib_umem_lease_break() I'd
love to use it. Right now I'm reaching for the only tool I know for
kernel enforced revocation of DMA access.
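
To make the timing in the SIGIO discussion concrete, here is a rough
userspace sketch of the sequence: take a lease, get SIGIO when someone
else breaks it, and work out how long remains before a forced
revocation. It assumes the behaviour follows the existing fcntl(2) file
lease model (F_SETLEASE) and does not model the RDMA side at all: no MR
teardown, no ib_umem_lease_break(). The function names are hypothetical
and chosen only for illustration.

```python
import fcntl
import signal

# Sysctl named in the thread; holds the grace period in seconds.
LEASE_BREAK_TIME_PATH = "/proc/sys/fs/lease-break-time"


def read_lease_break_time(path=LEASE_BREAK_TIME_PATH):
    """Return the grace period (seconds) a lease holder gets after SIGIO."""
    with open(path) as f:
        return int(f.read().strip())


def revocation_deadline(notified_at, grace_seconds):
    """Time after which the kernel stops waiting for a voluntary release."""
    return notified_at + grace_seconds


def take_write_lease(fd, on_break):
    """Take a write lease on fd and call on_break() when it is broken.

    Assumption: standard fcntl(2) lease rules apply, so the caller must
    own the file (or hold CAP_LEASE) and no other descriptor may have the
    file open. SIGIO is the default lease-break signal.
    """
    signal.signal(signal.SIGIO, lambda signum, frame: on_break())
    fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_WRLCK)


def release_lease(fd):
    """Give the lease up voluntarily, ideally before the deadline expires."""
    fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_UNLCK)
```

An application would record the time inside on_break(), call
revocation_deadline() with the value from read_lease_break_time(), and
try to finish its own cleanup before calling release_lease(). Whether
that cleanup can be bounded when the remote side controls MR lifetime
is exactly the open question above.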