Re: question regarding crypto driver DMA issue

On Wed, 8 Jul 2020 at 16:35, Van Leeuwen, Pascal <pvanleeuwen@xxxxxxxxxx> wrote:
>
> Hi Ard,
>
> Thanks for responding!
>
> > > For the situation where this problem is occurring, the actual buffers are stored inside
> > > the ahash_req structure. So my question is: is there any reason why this structure may
> > > not be DMA-able on some systems? (as I have a hunch that may be the problem ...)
> > >
> >
> > If DMA is non-coherent, and the ahash_req structure is also modified
> > by the CPU while it is mapped for DMA, you are likely to get a
> > conflict.
> >
> Ah ... I get it. If I dma_map TO_DEVICE then all relevant cachelines are flushed, then
> if the CPU accesses any other data sharing those cachelines, they get read back into
> the cache. Any subsequent access of the actual result will then read stale data from
> the cache.
>
> > It should help if you align the DMA-able fields sufficiently, and make
> > sure you never touch them while they are mapped for writing by the
> > device.
> >
> Yes, I guess that is the only way. I also toyed with the idea of using dedicated properly
> dma_alloc'ed buffers with pointers in the ahash_request structure, but I don't see how
> I can allocate per-request buffers as there is no callback to the driver on req creation.
>
> So ... is there any magical way within the Linux kernel to cacheline-align members of
> a structure? Considering cacheline size is very system-specific?
>

You can use ____cacheline_aligned as a modifier on struct members that
are accessed by the device. However, the alignment it applies is a
typical value, not a worst-case value, and since it is fixed at compile
time, you really need a worst-case value.
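
To illustrate the compile-time approach, here is a minimal userspace
sketch. The struct, its fields and DEMO_CACHE_BYTES are made up for the
example (DEMO_CACHE_BYTES stands in for whatever the kernel's alignment
macro expands to on a given build), and it only helps if the enclosing
struct itself starts on that boundary:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed compile-time cache line size; hypothetical, for illustration only. */
#define DEMO_CACHE_BYTES 64

/* Hypothetical per-request context mixing CPU-owned and device-owned data. */
struct demo_req_ctx {
	uint32_t flags;                             /* CPU-owned bookkeeping */
	alignas(DEMO_CACHE_BYTES) uint8_t result[32]; /* written by the device */
	alignas(DEMO_CACHE_BYTES) uint32_t state;   /* CPU-owned, pushed to its own line */
};

/* The device-written field starts on its own line and nothing CPU-owned shares it. */
static_assert(offsetof(struct demo_req_ctx, result) % DEMO_CACHE_BYTES == 0,
	      "result must start on a cache line boundary");
static_assert(offsetof(struct demo_req_ctx, state) -
	      offsetof(struct demo_req_ctx, result) >= DEMO_CACHE_BYTES,
	      "state must not share a cache line with result");
```

If the real cache line (or writeback granule) is larger than the value
baked in at build time, the fields end up sharing a line again, which is
why a compile-time constant is not enough here.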

On arm64, the maximum CWG (Cache Writeback Granule) value is 2k, which
is a bit excessive, so it might help to do this at runtime. One thing
you might do is increase the reqsize at TFM init time (in which case
you could also check whether the device is cache coherent for DMA),
and have a helper that gives you the address of the sub-struct inside
the request struct based on the current cache alignment.
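
A rough userspace sketch of that idea follows. All names here are
hypothetical; in a driver I would expect the size to be passed to
something like crypto_ahash_set_reqsize() at init time, the context to
come from ahash_request_ctx(), and the alignment from
dma_get_cache_alignment(), but treat that mapping as an assumption
rather than tested code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device-written part of the request context. */
struct demo_dma_bufs {
	uint8_t result[64];
};

/* Hypothetical CPU-only header; the DMA sub-struct lives in the space after it. */
struct demo_req_ctx {
	uint32_t flags;
};

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t demo_align_up(uintptr_t addr, size_t align)
{
	return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/*
 * Size to request at init time: the CPU-only header, worst-case padding
 * to reach an aligned address, and the DMA sub-struct rounded up so that
 * nothing else can share its last cache line.
 */
static size_t demo_reqsize(size_t align)
{
	return sizeof(struct demo_req_ctx) + (align - 1) +
	       demo_align_up(sizeof(struct demo_dma_bufs), align);
}

/* Locate the DMA sub-struct inside the request context at runtime. */
static struct demo_dma_bufs *demo_get_dma_bufs(struct demo_req_ctx *ctx,
					       size_t align)
{
	uintptr_t start = (uintptr_t)(ctx + 1);

	return (struct demo_dma_bufs *)demo_align_up(start, align);
}
```

The padding costs up to align - 1 bytes per request, but only the
alignment the running system actually needs, instead of a 2k worst case
on every build.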
