Re: Order of variables in specific sections when enabling optimization in gcc

Freddie Chopin <freddie_chopin@xxxxx> · Thu, 07 Mar 2019 16:51:02 +0100

On Thu, 2019-03-07 at 15:18 +0100, David Brown wrote:
> On 07/03/2019 14:56, Freddie Chopin wrote:
> > On Thu, 2019-03-07 at 14:05 +0100, David Brown wrote:
> > > The -fno-toplevel-reorder switch can be handy too - it will stop
> > > re-ordering within a translation unit.
> > 
> > Great - I'll try that soon (; This seems to be what I was looking
> > for!
> > 
> 
> I'd recommend putting it as:
> 
> 	#pragma GCC optimize ("-fno-toplevel-reorder")
> 
> in the source code defining your data.  That way it should work no
> matter what switches are used.  (I hope that option is allowed in the
> pragma - not all options are.)

Unfortunately GCC says this option is not valid for a pragma... But
compiling the file with this option does indeed result in identical
order in both source and object file (;

But the #pragma gave me an idea - I can actually disable the
optimizations with the pragma and this does work too:

#pragma GCC optimize ("0")
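
For the record, a minimal sketch of what such a file might look like. The
section name ".persistent" and all the variable names are made up for this
example, and as far as I know the GCC manual documents this pragma only as
setting options for functions defined later in the file, so the effect on
the order of variables is just what I observed, not a documented guarantee:

```c
#include <stdint.h>

/* Disable optimization for the rest of this translation unit.
 * Observed (not documented) side effect: variables are emitted in
 * source order. */
#pragma GCC optimize ("0")

/* Hypothetical variables placed in a hypothetical ".persistent" section. */
__attribute__((section(".persistent"))) uint32_t persistentMagic;
__attribute__((section(".persistent"))) uint32_t persistentFaultCount;
__attribute__((section(".persistent"))) uint32_t persistentLastFault;

/* Example user of the variables above: remember the most recent fault. */
void recordFault(uint32_t code)
{
    persistentLastFault = code;
    ++persistentFaultCount;
}
```

The order actually emitted can then be checked in the object file, for
example in the symbol table.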

> > > However, if you are using LTO or -fdata-sections and the
> > > --gc-sections linker option, variables that are not needed get
> > > eliminated.  (Note that this could happen even if they are actually
> > > used, if the compiler can figure out that the storage is not needed.)
> > 
> > --gc-sections and -fdata-sections do not affect variables for which I
> > explicitly set the section. They are not removed, even if not used
> > (I'm not using LTO).
> > 
> 
> -fdata-sections won't affect your explicit sections.  (This is a
> setting that is often used in embedded systems, as a way of minimising
> sizes, but it actually adds significantly to code size and run-time on
> devices like ARM Cortex.)

This may be a bit off-topic here, but the firmware I'm working on (for
an ARM Cortex-M4 chip) is:
- 205232 text, 6476 data and 83512 bss _WITH_ -fdata-sections;
- 207652 text, 6572 data and 83512 bss _WITHOUT_ -fdata-sections.

The first version is smaller for both flash and RAM (2420 bytes less
text, 96 bytes less data) - the difference is not significant (~1%),
but there's no flash vs. RAM trade-off. Maybe the whole thing is a bit
slower, but I wouldn't be so sure about that.

> Yes, it would be.  But it's very easy to accidentally mess up your
> variables when trying to add new ones or change existing ones.

Yes, understandable.

> A particular benefit I find with the struct solution is that you can
> use _Static_assert to check that the offsets of the different parts
> are as you expect them to be.  When you change things - replacing
> padding and "reserved" space with real variables - you will be glad of
> this extra check.
> 
> Another thing you can do with structs is to have multiple struct
> types - you can have "struct params_v1" now, and later have "struct
> params_v2" for a new version.  You can use pointers to these types to
> read off the "param structure version number" item and then update the
> old structure to the new one when you first run the new software.
> This can be a lot harder when the variables are defined independently.
> 
> The struct method also makes it vastly easier to have multiple sets of
> parameters - perhaps a factory default set in flash.  Reset to default
> then becomes a nice memcpy from an initialised const struct.  Doing
> this with individual items in a special section in RAM means
> duplicating these items as const items within a special section in
> flash - it's a maintenance problem waiting to happen.  And there is no
> equivalent of "-Wmissing-field-initializers" to help you spot your
> bugs.

This is all fine as long as all such variables are used in a similar
way - for example ONLY as device configuration. The moment you start
using them for completely different things, all the advantages you
listed above (except checks with static assertions) are gone. If you
have 10 objects as device configuration, 10 objects as a "persistent
scratch-pad" (for logging information about hard crashes, faults and
asserts) and another 10 as factory-only configuration, then there
really is no advantage in keeping them so closely coupled together...
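
Just so we are talking about the same thing, here is how I understand the
struct approach for a single group (device configuration only). This is
only a rough sketch: every field name, value and function name below is
invented for the example, the union stands in for whatever storage the
parameters really live in, and on a real target the storage would carry a
section attribute and the defaults would sit in flash.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts; all fields are made up for illustration. */
struct params_v1 {
    uint32_t version;      /* "param structure version number" item */
    uint32_t baudRate;
    uint16_t timeoutMs;
    uint8_t reserved[6];   /* room for future fields */
};

struct params_v2 {
    uint32_t version;
    uint32_t baudRate;
    uint16_t timeoutMs;
    uint16_t retryCount;   /* new in v2, carved out of "reserved" */
    uint8_t reserved[4];
};

/* Compile-time checks that the layout is what we expect. */
_Static_assert(offsetof(struct params_v1, timeoutMs) == 8, "v1 layout changed");
_Static_assert(sizeof(struct params_v1) == 16, "v1 size changed");
_Static_assert(offsetof(struct params_v2, retryCount) == 10, "v2 layout changed");
_Static_assert(sizeof(struct params_v2) == sizeof(struct params_v1),
        "v2 must fit in the v1 slot");

/* Factory defaults as an initialised const struct. */
static const struct params_v2 factoryDefaults = { 2u, 115200u, 500u, 3u, { 0 } };

/* Backing storage, viewed as either version. */
union paramsStorage {
    struct params_v1 v1;
    struct params_v2 v2;
};

union paramsStorage storage;

/* "Reset to default" is a plain memcpy from the const struct. */
void resetToDefaults(void)
{
    memcpy(&storage.v2, &factoryDefaults, sizeof factoryDefaults);
}

/* Read the version item and convert an old v1 layout to v2 in place. */
void upgradeParams(void)
{
    if (storage.v1.version == 1u) {
        const struct params_v1 old = storage.v1;
        storage.v2 = (struct params_v2){ 2u, old.baudRate, old.timeoutMs,
                factoryDefaults.retryCount, { 0 } };
    }
}
```

With one such struct per group (configuration, scratch-pad, factory-only)
the static assertions still work for each of them separately.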

Regards,
FCh