On Wed, Apr 14, 2021 at 12:50:52PM +0100, Matthew Wilcox wrote:
> On Wed, Apr 14, 2021 at 10:10:44AM +0200, Jesper Dangaard Brouer wrote:
> > Yes, indeed! - And very frustrating. It's keeping me up at night.
> > I'm dreaming about 32 vs 64 bit data structures. My fitbit stats tell
> > me that I don't sleep well with these kind of dreams ;-)
>
> Then you're going to love this ... even with the latest patch, there's
> still a problem. Because dma_addr_t is still 64-bit aligned _as a type_,
> that forces the union to be 64-bit aligned (as we already knew and worked
> around), but what I'd forgotten is that forces the entirety of struct
> page to be 64-bit aligned. Which means ...
>
>         /* size: 40, cachelines: 1, members: 4 */
>         /* padding: 4 */
>         /* forced alignments: 1 */
>         /* last cacheline: 40 bytes */
> } __attribute__((__aligned__(8)));
>
> .. that we still have a hole! It's just moved from being at offset 4
> to being at offset 36.
>
> > That said, I think we need to have a quicker fix for the immediate
> > issue with 64-bit bit dma_addr on 32-bit arch and the misalignment hole
> > it leaves[3] in struct page. In[3] you mention ppc32, does it only
> > happens on certain 32-bit archs?
>
> AFAICT it happens on mips32, ppc32, arm32 and arc. It doesn't happen
> on x86-32 because dma_addr_t is 32-bit aligned.
>
> Doing this fixes it:
>
> +++ b/include/linux/types.h
> @@ -140,7 +140,7 @@ typedef u64 blkcnt_t;
>   * so they don't care about the size of the actual bus addresses.
>   */
>  #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> -typedef u64 dma_addr_t;
> +typedef u64 __attribute__((aligned(sizeof(void *)))) dma_addr_t;
>  #else
>  typedef u32 dma_addr_t;
>  #endif
>
> > I'm seriously considering removing page_pool's support for doing/keeping
> > DMA-mappings on 32-bit arch's. AFAIK only a single driver use this.
>
> ... if you're going to do that, then we don't need to do this.

FWIW I already proposed that to Matthew in private a few days ago...
I am not even sure the AM572x has that support. I'd much prefer getting
rid of it as well, instead of overcomplicating the struct for a device
no one is going to need.

Cheers
/Ilias
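
For anyone following along, the tail-padding effect Matthew describes can be reproduced outside the kernel. This is only a rough sketch under stated assumptions: the typedefs addr_al8_t and addr_al4_t and both structs are hypothetical stand-ins, not the real dma_addr_t or struct page, and it relies on the GCC/Clang aligned attribute being allowed to lower alignment when used in a typedef. The fixed-width members mimic a 32-bit layout, so the numbers should come out the same on a 64-bit build host.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for dma_addr_t; not the kernel's definitions. */
typedef uint64_t __attribute__((aligned(8))) addr_al8_t; /* 64-bit aligned as a type */
typedef uint64_t __attribute__((aligned(4))) addr_al4_t; /* alignment lowered to 4 */

/*
 * Simplified mock, NOT the real struct page. The members add up to
 * 20 bytes, but the 8-byte aligned member forces the whole struct to
 * 8-byte alignment, so the size is rounded up to 24 (4 bytes of tail
 * padding).
 */
struct mock_al8 {
	uint32_t flags;
	uint32_t word;
	addr_al8_t dma;		/* offset 8 */
	uint32_t refcount;	/* offset 16 */
};

/* Same members, but the 64-bit field only asks for 4-byte alignment. */
struct mock_al4 {
	uint32_t flags;
	uint32_t word;
	addr_al4_t dma;		/* offset 8 */
	uint32_t refcount;	/* offset 16 */
};

/* Print size and alignment of both mocks so they can be compared. */
void print_layouts(void)
{
	printf("aligned(8): size %zu, align %zu\n",
	       sizeof(struct mock_al8), (size_t)_Alignof(struct mock_al8));
	printf("aligned(4): size %zu, align %zu\n",
	       sizeof(struct mock_al4), (size_t)_Alignof(struct mock_al4));
}
```

Calling print_layouts() from a small main() should show the first struct carrying 4 bytes of padding at the end and the second one not, which is the same kind of hole pahole reports in the quoted output, just at a different offset.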