Hi Gabriel,

On Thu, Dec 9, 2021 at 9:58 PM Gabriel L. Somlo <gsomlo@xxxxxxxxx> wrote:
> On Thu, Dec 09, 2021 at 09:31:49AM +0100, Geert Uytterhoeven wrote:
> > On Wed, Dec 8, 2021 at 9:14 PM Gabriel L. Somlo <gsomlo@xxxxxxxxx> wrote:
> > > I did *some* of this for v3, but since figured out how to use `pahole` :)
> >
> > Right, pahole.
> >
> > > On Mon, Dec 06, 2021 at 11:07:56AM +0100, Geert Uytterhoeven wrote:
> > > > > +struct litex_mmc_host {
> > > > > +	struct mmc_host *mmc;
> > > > > +	struct platform_device *dev;
> > > > > +
> > > > > +	void __iomem *sdphy;
> > > > > +	void __iomem *sdcore;
> > > > > +	void __iomem *sdreader;
> > > > > +	void __iomem *sdwriter;
> > > > > +	void __iomem *sdirq;
> > > > > +
> > > > > +	u32 resp[4];
> > > > > +	u16 rca;
> > > > > +
> > > > > +	void *buffer;
> > > > > +	size_t buf_size;
> > > > > +	dma_addr_t dma;
> > > > > +
> > > > > +	unsigned int freq;
> > > > > +	unsigned int clock;
> > > > > +	bool is_bus_width_set;
> > > > > +	bool app_cmd;
> > > > > +
> > > > > +	int irq;
> > > > > +	struct completion cmd_done;
> > > >
> > > > You may want to reorder the members to avoid implicit gaps
> > > > (i.e. structs first, followed by integral types in decreasing size).
> > >
> > > So, for v4, I'll have it looking like this, which `pahole` says is
> > > optimally packed:
> > >
> > > struct litex_mmc_host {
> > >         struct mmc_host *          mmc;                  /*     0     8 */
> > >         struct platform_device *   dev;                  /*     8     8 */
> > >         void *                     sdphy;                /*    16     8 */
> > >         void *                     sdcore;               /*    24     8 */
> > >         void *                     sdreader;             /*    32     8 */
> > >         void *                     sdwriter;             /*    40     8 */
> > >         void *                     sdirq;                /*    48     8 */
> > >         void *                     buffer;               /*    56     8 */
> > >         /* --- cacheline 1 boundary (64 bytes) --- */
> > >         size_t                     buf_size;             /*    64     8 */
> >
> > size_t is 32-bit on RV32, so you may want to move it below cmd_done.
> >
> > >         dma_addr_t                 dma;                  /*    72     8 */
> > >         struct completion          cmd_done;             /*    80    32 */
> > >         int                        irq;                  /*   112     4 */
> > >         unsigned int               ref_clk;              /*   116     4 */
> > >         unsigned int               sd_clk;               /*   120     4 */
> > >         u32                        resp[4];              /*   124    16 */
> > >         /* --- cacheline 2 boundary (128 bytes) was 12 bytes ago --- */
> > >         u16                        rca;                  /*   140     2 */
> > >         bool                       is_bus_width_set;     /*   142     1 */
> > >         bool                       app_cmd;              /*   143     1 */
> > >
> > >         /* size: 144, cachelines: 3, members: 18 */
> > >         /* last cacheline: 16 bytes */
> > > };
>
> After a bit of a fight, I managed to wrestle `pahole` to display useful
> information for 32-bit (rv32imac) builds:
>
> struct litex_mmc_host {
>         struct mmc_host *          mmc;                  /*     0     4 */
>         struct platform_device *   dev;                  /*     4     4 */
>         void *                     sdphy;                /*     8     4 */
>         void *                     sdcore;               /*    12     4 */
>         void *                     sdreader;             /*    16     4 */
>         void *                     sdwriter;             /*    20     4 */
>         void *                     sdirq;                /*    24     4 */
>         void *                     buffer;               /*    28     4 */
>         size_t                     buf_size;             /*    32     4 */
>         dma_addr_t                 dma;                  /*    36     4 */
>         struct completion          cmd_done;             /*    40    16 */
>         int                        irq;                  /*    56     4 */
>         unsigned int               ref_clk;              /*    60     4 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         unsigned int               sd_clk;               /*    64     4 */
>         u32                        resp[4];              /*    68    16 */
>         u16                        rca;                  /*    84     2 */
>         bool                       is_bus_width_set;     /*    86     1 */
>         bool                       app_cmd;              /*    87     1 */
>
>         /* size: 88, cachelines: 2, members: 18 */
>         /* last cacheline: 24 bytes */
> };
>
> Looks like even with `size_t buf_size` where it is right now, there
> still are no holes. I like it where it is, as it's related to the

Right, dma_addr_t is 32-bit, too. I'm just too used to LPAE on ARM ;-)

> field immediately preceding it (`buffer`). I'd rather not move it,
> particularly since we're not actually eliminating any additional
> holes.

Thanks, LGTM.

> What do you think (i.e., is there a configuration where there's still
> a chance we may run into trouble)?

CONFIG_PHYS_ADDR_T_64BIT=y/CONFIG_ARCH_DMA_ADDR_T_64BIT=y on 32-bit,
but that doesn't seem to be supported under arch/riscv/ yet.
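
For anyone who wants to play with the ordering rule (widest members first) without building a kernel, here is a minimal userspace sketch. It is not the driver code: the struct names, the MEMBER_SIZE and DEMO_PAYLOAD macros, and the two helper functions are made up for illustration, and only a handful of litex_mmc_host-like members are kept. It assumes a common ILP32 or LP64 ABI where bool is one byte and size_t is pointer-sized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size of one member, without needing an instance. */
#define MEMBER_SIZE(type, member) sizeof(((type *)0)->member)

/* Members in arbitrary order: the compiler pads before each wider member. */
struct demo_unordered {
	uint16_t rca;
	void *buffer;
	bool app_cmd;
	uint32_t resp[4];
	bool is_bus_width_set;
	size_t buf_size;
	uint32_t sd_clk;
};

/* Same members, widest first: pointer-sized, then 32-bit, 16-bit, bools. */
struct demo_ordered {
	void *buffer;
	size_t buf_size;
	uint32_t resp[4];
	uint32_t sd_clk;
	uint16_t rca;
	bool is_bus_width_set;
	bool app_cmd;
};

/* Sum of the member sizes; the same set of members for both layouts. */
#define DEMO_PAYLOAD(type) \
	(MEMBER_SIZE(type, buffer) + MEMBER_SIZE(type, buf_size) + \
	 MEMBER_SIZE(type, resp) + MEMBER_SIZE(type, sd_clk) + \
	 MEMBER_SIZE(type, rca) + MEMBER_SIZE(type, is_bus_width_set) + \
	 MEMBER_SIZE(type, app_cmd))

/* Padding bytes (internal holes plus tail padding) in each layout. */
static inline size_t demo_unordered_padding(void)
{
	return sizeof(struct demo_unordered) -
	       DEMO_PAYLOAD(struct demo_unordered);
}

static inline size_t demo_ordered_padding(void)
{
	return sizeof(struct demo_ordered) -
	       DEMO_PAYLOAD(struct demo_ordered);
}

/* Build-time check, similar in spirit to reading the pahole summary. */
static_assert(sizeof(struct demo_ordered) ==
	      DEMO_PAYLOAD(struct demo_ordered),
	      "demo_ordered should have no padding");
```

Compiling this with something like `cc -std=c11 -c` is enough to exercise the static_assert. The helpers are only there so the padding of both layouts can be compared at run time; for real kernel structs, pahole remains the tool to trust.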

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds