Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()

On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe
<jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Oct 09, 2017 at 12:28:29PM -0700, Dan Williams wrote:
>
>> > I don't think this has ever come up in the context of an all-device MR
>> > invalidate requirement. Drivers already have code to invalidate
>> > specific MRs, but to find all MRs that touch certain pages and then
>> > invalidate them would be new code.
>> >
>> > We also have ODP aware drivers that can retarget a MR to new
>> > physical pages. If the block map changes DAX should synchronously
>> > retarget the ODP MR, not halt DMA.
>>
>> Have a look at the patch [1], I don't touch the ODP path.
>
> But, does ODP work OK already? I'm not clear on that..

It had better. If the mapping is invalidated I would hope that
generates an I/O fault that gets handled by the driver to set up the new
mapping. I don't see how it can work otherwise.

>> > Most likely ODP & DAX would need to be used together to get robust
>> > user applications, as having the user QP's go to an error state at
>> > random times (due to DMA failures) during operation is never going to
>> > be acceptable...
>>
>> It's not random. The process that set up the mapping and registered
>> the memory gets SIGIO when someone else tries to modify the file map.
>> That process then gets /proc/sys/fs/lease-break-time seconds to fix
>> the problem before the kernel force revokes the DMA access.
>
> Well, the process can't fix the problem in bounded time, so it is
> random if it will fail or not.
>
> MR life time is under the control of the remote side, and time to
> complete the network exchanges required to release the MRs is hard to
> bound. So even if I implement SIGIO properly my app will still likely
> have random QP failures under various cases and workloads. :(
>
> This is why ODP should be the focus because this cannot work fully
> reliably otherwise..

The lease break time is configurable. If that application can't
respond to a stop request within a timeout of its own choosing then it
should not be using DAX mappings.
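
For reference, a minimal userspace sketch of the notify-then-deadline
pattern described above. It uses the existing fcntl(2) file lease
interface (F_SETLEASE plus the default SIGIO notification) purely as an
analogy; whether the DAX/RDMA path in this series is driven through the
same calls is an assumption, and the helper names (take_lease,
release_lease, seconds_left) are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes F_SETLEASE in <fcntl.h> */
#endif
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

#ifndef F_SETLEASE
#define F_SETLEASE 1024         /* Linux value, used only if the headers hid it */
#endif

/* Set from the signal handler when the kernel announces a lease break. */
static volatile sig_atomic_t lease_broken;

static void on_lease_break(int sig)
{
    (void)sig;
    lease_broken = 1;           /* only async-signal-safe work in here */
}

/* Read the system-wide grace period in seconds, or -1 on error. */
int read_lease_break_time(void)
{
    FILE *f = fopen("/proc/sys/fs/lease-break-time", "r");
    int secs = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &secs) != 1)
        secs = -1;
    fclose(f);
    return secs;
}

/* Install the SIGIO handler and take a read lease on fd; -1 on error. */
int take_lease(int fd)
{
    struct sigaction sa = {0};

    sa.sa_handler = on_lease_break;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;
    return fcntl(fd, F_SETLEASE, F_RDLCK);
}

/* Give the lease up voluntarily, e.g. after tearing down the mappings. */
int release_lease(int fd)
{
    return fcntl(fd, F_SETLEASE, F_UNLCK);
}

/* Seconds remaining before the kernel breaks the lease by force. */
long seconds_left(time_t notified_at, time_t now, int break_secs)
{
    long left = (long)break_secs - (long)(now - notified_at);

    return left > 0 ? left : 0;
}
```

An application would call take_lease() on a descriptor opened read-only,
and once lease_broken is set, stop its DMA users and call
release_lease() before seconds_left() reaches zero.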

>
>> > Perhaps you might want to initially only support ODP MR mappings with
>> > DAX and then the DMA fencing issue goes away?
>>
>> I'd rather try to fix the non-ODP DAX case instead of just turning it off.
>
> Well, what about using SIGKILL if the lease-break-time hits? The
> kernel will clean up the MRs when the process exits and this will
> fence DMA to that memory.

Can you point me to where the MR cleanup code fences DMA and quiesces
the device?

> But, still, if you really want to be fine-grained, then I think
> invalidating the impacted MRs is a better solution for RDMA than
> trying to do it with the IOMMU...

If there's a better routine for handling ib_umem_lease_break() I'd
love to use it. Right now I'm reaching for the only tool I know for
kernel enforced revocation of DMA access.
