Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe
<jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Oct 09, 2017 at 12:28:29PM -0700, Dan Williams wrote:
>
>> > I don't think this has ever come up in the context of an all-device MR
>> > invalidate requirement. Drivers already have code to invalidate
>> > specifc MRs, but to find all MRs that touch certain pages and then
>> > invalidate them would be new code.
>> >
>> > We also have ODP aware drivers that can retarget a MR to new
>> > physical pages. If the block map changes DAX should synchronously
>> > retarget the ODP MR, not halt DMA.
>>
>> Have a look at the patch [1], I don't touch the ODP path.
>
> But, does ODP work OK already? I'm not clear on that..

It had better. If the mapping is invalidated I would hope that
generates an io fault that gets handled by the driver to setup the new
mapping. I don't see how it can work otherwise.

>> > Most likely ODP & DAX would need to be used together to get robust
>> > user applications, as having the user QP's go to an error state at
>> > random times (due to DMA failures) during operation is never going to
>> > be acceptable...
>>
>> It's not random. The process that set up the mapping and registered
>> the memory gets SIGIO when someone else tries to modify the file map.
>> That process then gets /proc/sys/fs/lease-break-time seconds to fix
>> the problem before the kernel force revokes the DMA access.
>
> Well, the process can't fix the problem in bounded time, so it is
> random if it will fail or not.
>
> MR life time is under the control of the remote side, and time to
> complete the network exchanges required to release the MRs is hard to
> bound. So even if I implement SIGIO properly my app will still likely
> have random QP failures under various cases and work loads. :(
>
> This is why ODP should be the focus because this cannot work fully
> reliably otherwise..

The lease break time is configurable. If that application can't
respond to a stop request within a timeout of its own choosing then it
should not be using DAX mappings.

>
>> > Perhaps you might want to initially only support ODP MR mappings with
>> > DAX and then the DMA fencing issue goes away?
>>
>> I'd rather try to fix the non-ODP DAX case instead of just turning it off.
>
> Well, what about using SIGKILL if the lease-break-time hits? The
> kernel will clean up the MRs when the process exits and this will
> fence DMA to that memory.

Can you point me to where the MR cleanup code fences DMA and quiesces
the device?

> But, still, if you really want to be fined graned, then I think
> invalidating the impacted MR's is a better solution for RDMA than
> trying to do it with the IOMMU...

If there's a better routine for handling ib_umem_lease_break() I'd
love to use it. Right now I'm reaching for the only tool I know for
kernel enforced revocation of DMA access.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux