Re: [PATCH 0/9] Input: Fix insufficent DMA alignment.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 28 Nov 2022 10:16:31 -0800
Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx> wrote:

> Hi Jonathan,
> 
> On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote:
> > From: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > 
> > This problem was discovered in IIO as a side effect of the discussions about
> > relaxing kmalloc alignment on arm64 and resulted in a series of large
> > patch sets.
> > 
> > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@xxxxxxxxxx/
> > 
> > Unsurprisingly there are cases of it in other subsystems.
> > 
> > The short version of this is that there are a few known arm64 chips where
> > ___cacheline_aligned enforces 64 byte alignment which is what we typically
> > want for performance optimization as the size of the L1 cache lines.
> > However, further out in the cache hierarchy we have caches with 128 byte
> > lines.  Those are the ones that matter for DMA safety.
> > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which
> > in this case is 128 bytes.  
> 
> I wonder if we could have something like ____dmasafe_aligned instead of
> sprinkling ARCH_KMALLOC_MINALIGN around?

I agree in principle and eventually that will be ARCH_DMA_MINALIGN.
But it isn't useable yet for backports.
Catalin has worked on several approaches to reducing the alignment of kmalloc.
I may well be lagging on the current plan...
https://lore.kernel.org/linux-arm-kernel/20221106220143.2129263-1-catalin.marinas@xxxxxxx/#r

The way crypto and IIO handled this was to add a local define
IIO_DMA_MINALIGN, CRYPTO_MINALIGN

Catalin, if you do forwards with making ARCH_DMA_MINALIGN global available, make sure
to include IIO_DMA_MINALIGN.  I'm fine with the approach you used for crypto of changing
the local define.  Note that in IIO this typically only bloats a single structure
per device instance, so it's not worth the complexity of dynamic alignment.
As we are marking structure elements with it, in all cases they are multiples of
IIO_DMA_MINALIGN in size, so everything is easy.

On the other side of things, we might be able to relax most of these alignment tricks
in IIO (and input) if swiotlb is the approach eventually used.  I'm personally not that
keen on that transition but meh, this stuff usually involves a serial bus, so the extra
bounce isnt' too painful. Note that we regularly do fun things like setting tx
buffer to be ARCH_KMALLOC_MINALIGN and the rx buffer to be that +n bytes on the basis
that a given serial bus master won't trash itself and hence this is safe. I think your
approach will bounce all of those which is somewhat unfortunate.  Other fun things
include DMA mapping a series of small buffers in one allocation for a series of serial
messages.  Again, that is guaranteed to be safe.

Jonathan






> 
> > 
> > There is one other use of ____cacheline_aligned in input:
> > joystick/iforce/iforce-usb.c
> > 
> > Whilst suspicious I'm not sure enough of the requirements of USB to
> > know if they are there for DMA safety or some other constraint.  
> 
> Yes, USB has requirements similar to SPI.
> 
> Thanks.
> 




[Index of Archives]     [Linux Media Devel]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Linux Wireless Networking]     [Linux Omap]

  Powered by Linux