On Mon, Mar 18, 2024 at 02:29:23PM +0000, Jonathan Cameron wrote: > On Mon, 18 Mar 2024 15:09:32 +0200 > Andy Shevchenko <andy.shevchenko@xxxxxxxxx> wrote: > > > On Mon, Mar 18, 2024 at 2:41 PM Jonathan Cameron > > <Jonathan.Cameron@xxxxxxxxxx> wrote: > > > > > struct ad7944_adc { > > > > > struct spi_device *spi; > > > > > + enum ad7944_spi_mode spi_mode; > > > > > /* Chip-specific timing specifications. */ > > > > > const struct ad7944_timing_spec *timing_spec; > > > > > /* GPIO connected to CNV pin. */ > > > > > @@ -58,6 +75,9 @@ struct ad7944_adc { > > > > > } sample __aligned(IIO_DMA_MINALIGN); > > > > > }; > > > > > > > > Have you run `pahole` to see if there is a better place for a new member? > > > > > > I know this matters for structures where we see lots of them, but do we actually > > > care for one offs? Whilst it doesn't matter here I'd focus much more > > > on readability and like parameter grouping for cases like this than wasting > > > a few bytes. > > > > This is _also_ true, but think more about cache line contamination. > > Even not-so-important bytes may decrease the performance. In some > > cases it's tolerable, in some it is not (high-speed ADC). In general I > > assume that the developer has to understand many aspects of the > > software and cache line contamination may be last but definitely not > > least. > > > > Not totally sure what you are covering with contamination as many aspects > around caches and that's not really a standard term for any of them (as > far as I know). > > It's part of a multi cacheline allocation anyway (because it's tacked on the > end of the iio device struct, so fairly unlikely to share with other allocations > and definitely not on ARM because of the trailing __aligned(IIO_DMA_MINALIGN) > elements. > > If it matters more locally, then pahole is more likely to push you to pack You mean 'pahole --reorganize', right? Yeah, I need to take into account explicit __attribute__((__aligned__)) at the start of cachelines as a hint that the fields in a cacheline can't be moved outside of that cacheline or plain leave that cacheline members alone, as-is. I also need to get perf's data-type profiling as an input for 'pahole --reorganize', with that we may take into account the existing __aligned__ markings and combine it with what we get from data-type profiling. - Arnaldo > things together in a fashion that makes false sharing and similar perf issues > more likely if you are grouping things for packing purposes rather than > logical groups. > > If you just mean cache pressure then fair enough if we squeeze everything into > one cacheline and that doesn't cause false sharing. > 'Maybe' this will fit on x86. On Arm64 it's not going to > make any difference, just moving the padding around a bit within the line. > > So I'd argue premature optimization for a small, one off, structure. > > Jonathan > > >