On Thu, Dec 09, 2021 at 09:31:49AM +0100, Geert Uytterhoeven wrote: > Hi Gabriel, > > On Wed, Dec 8, 2021 at 9:14 PM Gabriel L. Somlo <gsomlo@xxxxxxxxx> wrote: > > I did *some* of this for v3, but since figured out how to use `pahole` :) > > Right, pahole. > > > On Mon, Dec 06, 2021 at 11:07:56AM +0100, Geert Uytterhoeven wrote: > > > > +struct litex_mmc_host { > > > > + struct mmc_host *mmc; > > > > + struct platform_device *dev; > > > > + > > > > + void __iomem *sdphy; > > > > + void __iomem *sdcore; > > > > + void __iomem *sdreader; > > > > + void __iomem *sdwriter; > > > > + void __iomem *sdirq; > > > > + > > > > + u32 resp[4]; > > > > + u16 rca; > > > > + > > > > + void *buffer; > > > > + size_t buf_size; > > > > + dma_addr_t dma; > > > > + > > > > + unsigned int freq; > > > > + unsigned int clock; > > > > + bool is_bus_width_set; > > > > + bool app_cmd; > > > > + > > > > + int irq; > > > > + struct completion cmd_done; > > > > > > You may want to reorder the members to avoid implicit gaps > > > (i.e. structs first, followed by integral types in decreasing size). > > > > So, for v4, I'll have it looking like this, which `pahole` says is > > optimally packed: > > > > struct litex_mmc_host { > > struct mmc_host * mmc; /* 0 8 */ > > struct platform_device * dev; /* 8 8 */ > > void * sdphy; /* 16 8 */ > > void * sdcore; /* 24 8 */ > > void * sdreader; /* 32 8 */ > > void * sdwriter; /* 40 8 */ > > void * sdirq; /* 48 8 */ > > void * buffer; /* 56 8 */ > > /* --- cacheline 1 boundary (64 bytes) --- */ > > size_t buf_size; /* 64 8 */ > > size_t is 32-bit on RV32, so you may want to move it below cmd_done. > > > dma_addr_t dma; /* 72 8 */ > > struct completion cmd_done; /* 80 32 */ > > int irq; /* 112 4 */ > > unsigned int ref_clk; /* 116 4 */ > > unsigned int sd_clk; /* 120 4 */ > > u32 resp[4]; /* 124 16 */ > > /* --- cacheline 2 boundary (128 bytes) was 12 bytes ago --- */ > > u16 rca; /* 140 2 */ > > bool is_bus_width_set; /* 142 1 */ > > bool app_cmd; /* 143 1 */ > > > > /* size: 144, cachelines: 3, members: 18 */ > > /* last cacheline: 16 bytes */ > > }; After a bit of a fight, I managed to wrestle `pahole` to display useful information for 32-bit (rv32imac) builds: struct litex_mmc_host { struct mmc_host * mmc; /* 0 4 */ struct platform_device * dev; /* 4 4 */ void * sdphy; /* 8 4 */ void * sdcore; /* 12 4 */ void * sdreader; /* 16 4 */ void * sdwriter; /* 20 4 */ void * sdirq; /* 24 4 */ void * buffer; /* 28 4 */ size_t buf_size; /* 32 4 */ dma_addr_t dma; /* 36 4 */ struct completion cmd_done; /* 40 16 */ int irq; /* 56 4 */ unsigned int ref_clk; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ unsigned int sd_clk; /* 64 4 */ u32 resp[4]; /* 68 16 */ u16 rca; /* 84 2 */ bool is_bus_width_set; /* 86 1 */ bool app_cmd; /* 87 1 */ /* size: 88, cachelines: 2, members: 18 */ /* last cacheline: 24 bytes */ }; Looks like even with `size_t buf_size` where it is right now, there still are no holes. I like it where it is, as it's related to the field immediately preceding it (`buffer`). I'd rather not move it, particularly since we're not actually eliminating any additional holes. What do you think (i.e., is there a configuration where there's still a chance we may run into trouble)? Thanks, --Gabriel