On Fri, Apr 08, 2022 at 05:11:28PM +0800, Herbert Xu wrote: > On Fri, Apr 08, 2022 at 10:04:54AM +0100, Catalin Marinas wrote: > > My point is that if the crypto code kmallocs a size aligned to > > crypto_tfm_ctx_alignment() (and CRYPTO_MINALIGN), the slab allocator > > will return memory aligned to CRYPTO_MINALIGN even if > > ARCH_KMALLOC_MINALIGN is smaller. > > No we don't align the size to CRYPTO_MINALIGN at all. We simply > assume that this is the alignment returned by kmalloc. > > > Would the crypto code, say, do a kmalloc(64) and expect a 128 byte > > alignment (when CRYPTO_MINALIGN == 128)? Or does it align the size to > > CRYPTO_MINALIGN and do a kmalloc(128) directly? If it's the latter, I > > don't think there's a problem. > > It's the former. Does this only matter for DMA? If there other unrelated alignment requirements, I think those drivers should be fixed for their own cra_alignmask independent of the CPU cache line and DMA needs (e.g. some 1024-bit vectors that would benefit from an aligned load). With this series, the dynamic arch_kmalloc_minalign() still provides the DMA-safe alignment even if it would be smaller than the default CRYPTO_MINALIGN of 128. Let's say we have a structure: struct crypto_something { u8 some_field; u8 data[] CRYPTO_MINALIGN_ATTR; }; The sizeof(struct crypto_something) is automatically a multiple of CRYPTO_MINALIGN (128 bytes for arm64). While kmalloc() could return a smaller alignment, arch_kmalloc_minalign(), the data pointer above is still aligned to arch_kmalloc_minalign() and DMA-safe since CRYPTO_MINALIGN is a multiple of this (similar to the struct devres case, https://lore.kernel.org/all/YlRn2Wal4ezjvomZ@xxxxxxx/). Even if such struct is included in another struct, the alignment and sizeof is inherited by the containing object. So let's assume the driver needs 64 bytes of data in addition to the struct, it would allocate: kmalloc(sizeof(struct crypto_something) + 64); Prior to this series, kmalloc() would return a 256-byte aligned pointer. With this series, if the cache line size on a SoC is 64-byte, the allocation falls under the kmalloc-192 cache, so 'data' would be 64-byte aligned which is safe for DMA. > I think you can still make the change you want, but first you need > to modify the affected drivers to specify their actual alignment > requirement explicitly through cra_alignmask and then use the > correct methods to access the context pointer. At a quick grep, most cra_alignmask values are currently 15 or smaller. I'm not convinced the driver needs to know about the CPU cache alignment. We could set cra_alignmask to CRYPTO_MINALIGN but that would incur unnecessary overhead via function like setkey_unaligned() when the arch_kmalloc_minalign() was already sufficient for DMA safety. Maybe I miss some use-cases or I focus too much on DMA safety. Thanks. -- Catalin