On Tue, 12 Apr 2022 at 14:31, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> On Tue, Apr 12, 2022 at 06:18:46PM +0800, Herbert Xu wrote:
> > On Tue, Apr 12, 2022 at 11:02:54AM +0100, Catalin Marinas wrote:
> > > This series does not penalise any architecture. It doesn't even make
> > > arm64 any worse than it currently is.
> >
> > Right, the patch as it stands doesn't change anything. However,
> > it is also broken as it stands. As I said before, CRYPTO_MINALIGN
> > is not something that is guaranteed by the Crypto API, it is simply
> > a statement of whatever kmalloc returns.
>
> I agree that CRYPTO_MINALIGN is not guaranteed by the Crypto API. What
> I'm debating is the intended use for CRYPTO_MINALIGN in some (most?) of
> the drivers. It's not just about kmalloc() but also a build-time offset
> of buffers within structures to guarantee DMA safety. This can't be
> fixed by cra_alignmask.
>
> We could leave CRYPTO_MINALIGN as ARCH_KMALLOC_MINALIGN and that matches
> it being just a statement of the kmalloc() minimum alignment. But since
> it is also overloaded with the DMA in-structure offset alignment, we'd
> need a new CRYPTO_DMA_MINALIGN (and _ATTR) to annotate those structures.
> I have a suspicion there'll be fewer of the original CRYPTO_MINALIGN
> uses left, hence my approach to making this bigger from the start.
>
> There's also Ard's series introducing CRYPTO_REQ_MINALIGN while leaving
> CRYPT_MINALIGN for DMA-safe offsets (IIUC):
>
> https://lore.kernel.org/r/20220406142715.2270256-1-ardb@xxxxxxxxxx
>
> > So if kmalloc is no longer returning CRYPTO_MINALIGN-aligned
> > memory, then those drivers that need this alignment for DMA
> > will break anyway.
>

One thing to note here is that minimum DMA *alignment* is not the same
as the padding to cache writeback granule (CWG) that is needed when
dealing with non-cache coherent inbound DMA.
The former is typically a property of the peripheral IP, and is
something that the driver needs to deal with, potentially by setting
cra_alignmask to ensure that the input and output buffers are placed
such that they can be accessed via DMA by the peripheral.

The latter is a property of the CPU's cache hierarchy: not only the
size of the CWG, but also whether or not DMA is cache coherent to begin
with. This is not something the driver should usually have to care
about if it uses the DMA API correctly.

The reason why CRYPTO_MINALIGN matters for DMA in spite of this is that
some drivers not only use DMA for processing the bulk of the data
(typically presented using scatterlists) but sometimes also use DMA to
map parts of the request and TFM structures, which carry control data
used by the CPU to manage the crypto requests. Doing a non-coherent DMA
write into such a structure may blow away 64 or 128 bytes of data, even
if the write itself is much smaller. This is because we need to perform
cache invalidation in order for the CPU to be able to observe what the
device wrote to that memory, and the invalidated cache lines may be
shared with other data items, and may become dirty while the DMA
mapping is active.

This is what I am addressing with my patch series, i.e., padding out
the driver-owned parts of the struct to the CWG size so that cache
invalidation does not affect data owned by other layers in the crypto
cake, but only at runtime. By doing this consistently for TFM and
request structures, we should be able to disregard ARCH_DMA_MINALIGN
entirely when it comes to defining CRYPTO_MINALIGN, as it is highly
unlikely that any of these peripherals would require more than 8 or 16
bytes of alignment for the DMA operations themselves.

> No. As per one of my previous emails, kmalloc() will preserve the DMA
> alignment for an SoC even if smaller than CRYPTO_MINALIGN (or a new
> CRYPTO_DMA_MINALIGN).
> Since kmalloc() returns DMA-safe pointers and
> CRYPTO_MINALIGN (or a new CRYPTO_DMA_MINALIGN) is DMA-safe, so would an
> offset from a pointer returned by kmalloc().
>
> > If you want the Crypto API to guarantee alignment over and above
> > that returned by kmalloc, the correct way is to use cra_alignmask.
>
> For kmalloc(), this would work, but for the current CRYPTO_MINALIGN_ATTR
> uses it won't.
>
> Thanks.
>
> --
> Catalin
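
To illustrate the CWG padding point made in the reply above, here is a
minimal userspace C sketch. It is not code from the patch series under
discussion: the struct names, the field names and the DEMO_CWG value
are all hypothetical, and it pads at build time purely to make the
layout visible, whereas the reply describes padding at runtime. It also
assumes the struct itself starts on a granule boundary.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache writeback granule in bytes (hypothetical; the real
 * value is a property of the CPU's cache hierarchy). */
#define DEMO_CWG 128
#define DEMO_DEV_LEN 16

/* Hypothetical request layout: CPU-owned control data packed next to
 * a small area that a device writes via non-coherent DMA. */
struct demo_req_unpadded {
	unsigned long flags;                    /* CPU-owned */
	unsigned char dev_result[DEMO_DEV_LEN]; /* device-written */
	void *next;                             /* CPU-owned */
};

/* Same fields, but the device-written area gets granules of its own. */
struct demo_req_padded {
	unsigned long flags;
	alignas(DEMO_CWG) unsigned char dev_result[DEMO_DEV_LEN];
	alignas(DEMO_CWG) void *next;
};

static_assert(offsetof(struct demo_req_padded, dev_result) % DEMO_CWG == 0,
	      "device-written area must start on a granule boundary");

/* Returns 1 if byte ranges [a, a+alen) and [b, b+blen), given as
 * offsets from a granule-aligned base, touch a common granule. */
int shares_granule(size_t a, size_t alen, size_t b, size_t blen)
{
	size_t a_first = a / DEMO_CWG, a_last = (a + alen - 1) / DEMO_CWG;
	size_t b_first = b / DEMO_CWG, b_last = (b + blen - 1) / DEMO_CWG;

	return a_first <= b_last && b_first <= a_last;
}

/* Does invalidating dev_result also hit CPU-owned fields? */
int demo_unpadded_shares(void)
{
	size_t dev = offsetof(struct demo_req_unpadded, dev_result);

	return shares_granule(dev, DEMO_DEV_LEN,
			      offsetof(struct demo_req_unpadded, flags),
			      sizeof(unsigned long)) ||
	       shares_granule(dev, DEMO_DEV_LEN,
			      offsetof(struct demo_req_unpadded, next),
			      sizeof(void *));
}

int demo_padded_shares(void)
{
	size_t dev = offsetof(struct demo_req_padded, dev_result);

	return shares_granule(dev, DEMO_DEV_LEN,
			      offsetof(struct demo_req_padded, flags),
			      sizeof(unsigned long)) ||
	       shares_granule(dev, DEMO_DEV_LEN,
			      offsetof(struct demo_req_padded, next),
			      sizeof(void *));
}
```

In the unpadded layout the 16-byte device-written area sits in the same
granule as the CPU-owned fields, so invalidating it could discard CPU
updates to them. In the padded layout it does not. Note that the
padding here is about granule sharing, not about the (much smaller)
alignment the peripheral itself needs.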