> On Apr 15, 2017, at 5:55 AM, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Fri, Apr 14, 2017 at 11:51:39AM -0400, Chuck Lever wrote:
>> Howdy-
>>
>> I recently found a way to crash my HCA (and the whole system) using a
>> signal on an NFS/RDMA mount point that is using FMR. I've documented
>> the issue:
>>
>> https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
>>
>> And I have an NFS/RDMA fix I'm testing for v4.13. The fix is to prevent
>> simultaneous calls to ib_unmap_fmr with the same FMR.
>>
>> While working on the fix, I've been looking for any documentation
>> regarding serialization requirements for ib_unmap_fmr. Knut Omang pointed
>> out to me that Documentation/infiniband/core-locking.txt makes this bold
>> statement:
>>
>>> Reentrancy
>>>
>>> All of the methods in struct ib_device exported by a low-level
>>> driver must be fully reentrant. The low-level driver is required to
>>> perform all synchronization necessary to maintain consistency, even
>>> if multiple function calls using the same object are run
>>> simultaneously.
>>>
>>> The IB midlayer does not perform any serialization of function calls.
>>>
>>> Because low-level drivers are reentrant, upper level protocol
>>> consumers are not required to perform any serialization.
>>
>> Does this re-entrancy guarantee apply only when ib_unmap_fmr is called
>> concurrently with unique FMRs?
>
> According to description, it should apply to all operations on ib_device
> without any exclusion.
>
>>
>> I've been told it is not possible for ib_unmap_fmr to detect when it has
>> been invoked in different threads with the same FMR.
>
> Right, FMR management is implemented as direct writes to MPT and MTT
> tables. HW doesn't distinguish simultaneous calls to the TPT cache.
>
>> but apparently the user space equivalent does not have the same
>> vulnerability (I did not test this assertion).
>>
>> I'm wondering what is proper closure here (aside from merging the
>> NFS/RDMA fix).
>
> Maybe serialize unmap_fmr (workqueue) from the driver side?

Either correcting the documentation or a driver change is OK with me.
Claiming that "upper level protocol consumers are not required to
perform any serialization" seems like a stretch.

--
Chuck Lever
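
For illustration only, here is a rough user-space sketch of the
consumer-side approach mentioned above: preventing simultaneous unmap
calls on the same FMR. It is not the NFS/RDMA patch and not kernel code.
The struct and the demo_* helpers are hypothetical names made up for
this example, and a pthread mutex stands in for whatever lock a real
consumer would use. The point where a kernel consumer would invoke
ib_unmap_fmr is marked with a comment.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-FMR bookkeeping kept by the consumer (not a kernel type). */
struct demo_fmr {
    pthread_mutex_t lock;  /* serializes unmap attempts on this one FMR */
    bool mapped;           /* true while the FMR holds a mapping */
    int hw_unmaps;         /* counts simulated hardware unmap operations */
};

static void demo_fmr_init(struct demo_fmr *f)
{
    pthread_mutex_init(&f->lock, NULL);
    f->mapped = false;
    f->hw_unmaps = 0;
}

static void demo_fmr_map(struct demo_fmr *f)
{
    pthread_mutex_lock(&f->lock);
    f->mapped = true;
    pthread_mutex_unlock(&f->lock);
}

/*
 * Unmap the FMR at most once per mapping. Returns true if this caller
 * performed the unmap, false if another caller had already done it.
 */
static bool demo_fmr_unmap(struct demo_fmr *f)
{
    bool did_unmap = false;

    pthread_mutex_lock(&f->lock);
    if (f->mapped) {
        /* Assumption: a kernel consumer would call ib_unmap_fmr here. */
        f->hw_unmaps++;
        f->mapped = false;
        did_unmap = true;
    }
    pthread_mutex_unlock(&f->lock);

    return did_unmap;
}
```

With this shape, two threads racing to unmap the same FMR (say, a
signal-driven teardown and the normal completion path) result in a
single unmap operation; the loser of the race sees mapped == false and
returns without touching the hardware.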