From: Rustam Kovhaev
> Sent: 30 November 2021 07:00
>
> On Tue, Nov 23, 2021 at 10:18:27AM +0000, David Laight wrote:
> > From: Vlastimil Babka
> > > Sent: 22 November 2021 10:46
> > >
> > > On 11/22/21 11:36, Christoph Lameter wrote:
> > > > On Mon, 22 Nov 2021, Vlastimil Babka wrote:
> > > >
> > > >> But it seems there's no reason we couldn't do better? I.e. use the value of
> > > >> SLOB_HDR_SIZE only to align the beginning of actual object (and name the
> > > >> define different than SLOB_HDR_SIZE). But the size of the header, where we
> > > >> store the object length could be just a native word - 4 bytes on 32bit, 8 on
> > > >> 64bit. The address of the header shouldn't have a reason to be also aligned
> > > >> to ARCH_KMALLOC_MINALIGN / ARCH_SLAB_MINALIGN as only SLOB itself processes
> > > >> it and not the slab consumers which rely on those alignments?
> > > >
> > > > Well the best way would be to put it at the end of the object in order to
> > > > avoid the alignment problem. This is a particular issue with SLOB because
> > > > it allows multiple types of objects in a single page frame.
> > > > ...
> >
> > > > So I guess placement at the beginning cannot be avoided. That in turn runs
> > > > into trouble with the DMA requirements on some platforms where the
> > > > beginning of the object has to be cache line aligned.
> > >
> > > It's no problem to have the real beginning of the object aligned, and the
> > > prepended header not.
> >
> > I'm not sure that helps.
> > The header can't share a cache line with the previous item (because it
> > might be mapped for DMA) so will always take a full cache line.
>
> I thought that DMA API allocates buffers that are larger than page size.
> DMA pool seems to be able to give out smaller buffers, but underneath it
> seems to be calling page allocator.
> The SLOB objects that have this header are all less than page size, and
> they cannot end up in DMA code paths, or can they?

The problem isn't dma_alloc_coherent(); it is when memory allocated
elsewhere is used for DMA.
On systems with non-coherent DMA accesses the data cache has to be
flushed before all DMA transfers and invalidated after read DMA transfers.
The cpu must not dirty any of the cache lines associated with a read DMA.

This is on top of any requirements for the alignment of the returned address.

	David
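
To put rough numbers on the cache line point, here is a small sketch of
the arithmetic. It is not SLOB code. CACHE_LINE, HDR_SIZE and both helper
names are made up for illustration, and the 64 and 8 byte values are only
assumptions (the real values are per-architecture). It assumes neither the
header nor the object may share a cache line with a neighbouring object.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; real ones are per-architecture. */
#define CACHE_LINE 64u	/* hypothetical cache line size in bytes */
#define HDR_SIZE    8u	/* hypothetical native-word length header */

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/*
 * Bytes consumed by one allocation when the header must not share a
 * cache line with the previous object and the object itself starts on
 * a cache line boundary: the header occupies a whole line of its own
 * and the object is padded out to whole lines.
 */
static size_t dma_safe_footprint(size_t obj_size)
{
	return round_up(HDR_SIZE, CACHE_LINE) + round_up(obj_size, CACHE_LINE);
}
```

With those assumed sizes, dma_safe_footprint(8) is 128, so an 8 byte
object costs two full cache lines even though the header itself is only
8 bytes.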