On Fri, 15 Apr 2022 at 12:45, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Apr 15, 2022 at 12:22:27PM +0200, Ard Biesheuvel wrote:
> >
> > Subsequent objects are owned by the driver, and it is the
> > responsibility of the driver not to modify the fields while it is also
> > mapped for DMA (and we have had issues in the past where drivers
> > violated this rule). So as long as ARCH_KMALLOC_ALIGN guarantees
> > actual DMA minimum alignment for both the start and the end, we
> > shouldn't need any explicit padding at the end.
>
> I don't understand why this is guaranteed. The driver context
> size is arbitrary so it could end in the middle of a cacheline.
> The slab allocator could well lay it out so that the next kmalloc
> object starts right after the end of the context, in which case
> they would share a cache-line.
>

If this is the case, things are already broken today. We never take
ARCH_DMA_MINALIGN into account when adding the driver ctx size to the
overall allocation size.

> The next kmalloc object could be (and in fact is likely to be)
> of the same type.
>
> Previously this wasn't possible because kmalloc guaranteed
> alignment.
>

Either it does or it doesn't. If kmalloc() guarantees the actual DMA
alignment at both ends, the situation you describe cannot occur, given
that the driver's slice of the request/TFM structure would be padded
up to actual DMA alignment, in spite of whether or not
ARCH_DMA_MINALIGN exceeds that. So it would never share a cacheline in
practice, even though they might live in the same 128 byte aligned
region on a system that has a minimum DMA alignment that is lower than
that.

If kmalloc() does not guarantee that the end of the buffer is aligned
to actual DMA alignment, things are already broken today.