On Wed, Aug 5, 2020 at 3:44 AM Samuel Holland <samuel@xxxxxxxxxxxx> wrote: > On 8/3/20 9:02 AM, Arnd Bergmann wrote: > > On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland <samuel@xxxxxxxxxxxx> wrote: > >> All of the command structures are packed, due to the "#pragma pack(1)" earlier > >> in the file. So alignment is not an issue. This dma_addr_t member _is_ the > >> explicit padding to make sizeof(TW_Command) - > >> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And > >> indeed the structure is expected to be a different size depending on > >> sizeof(dma_addr_t). > > > > Ah, so only the first few members are accessed by hardware and the > > last union is only accessed by the OS then? In that case I suppose it is > > all fine, but I would also suggest removing the "#pragma packed" > > to get somewhat more efficient access on systems that have problems > > with misaligned accesses. > > I don't know what part the hardware accesses; everything I know about the > hardware comes from reading the driver. I see now from your explanation below that this is a hardware-defined structure. I was confused by how it can be either 32 or 64 bits wide but found the tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0; line now that tells the hardware about which format is used. > The problem with removing the "#pragma pack(1)" is that the structure is > inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin > with a __le64. I think a fairly clean way to handle this would be to remove the pragma and instead define a local type like #if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) typedef __le64 twa_address_t __packed; #else typedef __le32 twa_addr_t; #endif The problem with marking the entire structure as packed, rather than just individual members is that you end up with very inefficient bytewise access on some architectures (especially those without cache-coherent DMA or hardware unaligned access in the CPU), so I would recommend avoiding that in portable driver code. Arnd