On Thu, Oct 12, 2017 at 01:10:33PM -0700, Dan Williams wrote:
> On Thu, Oct 12, 2017 at 11:27 AM, Jason Gunthorpe
> <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, Oct 10, 2017 at 01:17:26PM -0700, Dan Williams wrote:
> >
> >> Also keep in mind that what triggers the lease break is another
> >> application trying to write or punch holes in a file that is mapped
> >> for RDMA. So, if the hardware can't handle the iommu mapping getting
> >> invalidated asynchronously and the application can't react in the
> >> lease break timeout period then the administrator should arrange for
> >> the file to not be written or truncated while it is mapped.
> >
> > That makes sense, but why not return ENOSYS or something to the app
> > trying to alter the file if the RDMA hardware can't support this
> > instead of having the RDMA app deal with this lease break weirdness?
>
> That's where I started, an inode flag that said "hands off, this file
> is busy", but Christoph pointed out that we should reuse the same
> mechanisms that pnfs is using. The pnfs protection scheme uses file
> leases, and once the kernel decides that a lease needs to be broken /
> layout needs to be recalled there is no stopping it, only delaying.

That was just a suggestion - the important point is that a hands-off
flag is a no-go.

> However, chatting this over with a few more people I have an alternate
> solution that effectively behaves the same as how non-ODP hardware
> handles this case of hole punch / truncation today. So, today if this
> scenario happens on a page-cache backed mapping, the file blocks are
> unmapped and the RDMA continues into pinned pages that are no longer
> part of the file. We can achieve the same thing with the iommu, just
> re-target the I/O into memory that isn't part of the file. That way
> hardware does not see I/O errors and the DAX data consistency model is
> no worse than the page-cache case.

Yikes.
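
For reference, the userspace side of the lease-break flow being reused
here looks roughly like the sketch below. This is only the stock
fcntl(2) F_SETLEASE / SIGIO interface, not the MAP_DIRECT variant from
this series, and the RDMA registration / teardown steps are just
placeholders:

/* lease_holder.c - toy lease holder, most error handling omitted */
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t lease_broken;

static void sigio_handler(int sig)
{
	lease_broken = 1;		/* kernel is recalling our lease */
}

int main(int argc, char **argv)
{
	struct sigaction sa;
	int fd;

	if (argc < 2)
		return 1;

	fd = open(argv[1], O_RDONLY);	/* a read lease needs O_RDONLY */
	if (fd < 0)
		return 1;

	sa.sa_handler = sigio_handler;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGIO, &sa, NULL);	/* SIGIO is the default break signal */

	if (fcntl(fd, F_SETLEASE, F_RDLCK) < 0) {
		perror("F_SETLEASE");
		return 1;
	}

	/*
	 * ... register memory / set up RDMA against the mapping here
	 * (placeholder) ...
	 *
	 * When a conflicting access comes in, the kernel delivers SIGIO
	 * and the holder has /proc/sys/fs/lease-break-time seconds
	 * (45 by default) before the lease is broken anyway.
	 */
	while (!lease_broken)
		pause();

	/* get the hardware off the file, then release the lease */
	fcntl(fd, F_SETLEASE, F_UNLCK);
	close(fd);
	return 0;
}

The conflicting access on the other side is just an ordinary open for
write or a truncate from another process - or, for the DAX case
discussed above, something like fallocate(fd, FALLOC_FL_PUNCH_HOLE |
FALLOC_FL_KEEP_SIZE, off, len). If the holder doesn't release in time
the kernel revokes the lease regardless, which is the "no stopping it,
only delaying" behaviour described above.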