On 27/11/2023 17:23, David Brown wrote:
On 27/11/2023 16:16, Ed Robbins via Gcc-help wrote:
Hello,
I am using gcc-arm-none-eabi with a cortex M7 device, with caches
(data/instruction) enabled. Nested function calls result in a usage fault
because there is no clear cache call for this platform.
I am not sure I understand you here. Are you talking about trying to
use gcc nested function extensions, implemented by trampolines (small
function stubs on the stack)? If so, then the simple answer is - don't
do that. It's a /really/ bad idea. As far as I understand it, these
are a left-over from the way nested functions were originally
implemented in other gcc languages (Pascal, Ada, Modula-2), which now
handle things differently and far more efficiently. Trampolines were a
convenient way to implement nested functions some 30 years ago, before
caches were the norm, before anyone thought about security, before
processors had prefetching, and before people realised what an
appallingly bad idea self-modifying code is.
If you want to use nested functions, use a language that supports nested
functions, such as Ada, or use C++ with lambdas (which are a bit like
nested functions only much better).
Is there a way to provide the required functions without rebuilding gcc?
I have been looking at the source and, as far as I can tell, there is not.
I can think of at least four ways :
1. The SDK for your microcontroller, provided by the manufacturer, will
have headers with cache clear functions.
2. The ARM CMSIS headers - also available from your manufacturer - have
intrinsic functions, including cache clear functions.
3. gcc has a generic "__builtin___clear_cache" function:
<https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005f_005f_005fclear_005fcache>
4. gcc supports the "ARM C Language Extensions", which include cache
control intrinsics:
<https://gcc.gnu.org/onlinedocs/gcc/ARM-C-Language-Extensions-_0028ACLE_0029.html>
<https://developer.arm.com/documentation/ihi0053/latest/>
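For reference, a minimal sketch of how option 3 is normally called. The helper name install_code is made up for illustration. Whether the builtin performs any real cache maintenance depends on the target: on a hosted x86 build it does nothing, and a bare-metal Arm toolchain may not provide a working implementation at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy machine code into a buffer, then ask the
   compiler to make the instruction stream coherent with that memory.
   The builtin takes the start address and the end address (exclusive). */
void *install_code(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)dst + len);
    return dst;
}
```

This only shows the call shape. It says nothing about whether the call is sufficient on a given Cortex-M7 part.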
I completely agree with David's comments about nested functions. Don't
do it!
Cleaning the D-caches from user-level on Arm is practically impossible
if there is no "OS" support; flushing the I-cache is equally difficult.
This includes m-profile devices with secure and non-secure code, where
only secure code can execute the cache management operations. The same
is true for some, if not all, a-profile devices as well.
Looking at the compiler sources, the __clear_cache builtin is only
implemented for Linux and even there it calls the kernel to do the work.
ACLE does not define a clear cache intrinsic operation (as far as I can
see). It does provide some of the primitives needed for a cache clear,
such as __dmb() and __isb(), but on their own, these are not enough.
CMSIS does appear to provide some primitives (SCB_CleanDCache_by_Addr
and SCB_InvalidateICache_by_Addr), but these will directly invoke the
relevant secure-mode primitives. If you want them in non-secure mode,
you'll need to export a suitable API from your secure code and then
arrange to use that. The compiler knows nothing about CMSIS, so this
isn't much help for trampolines, I'm afraid.
Microcontroller SDKs are almost certain to face similar issues, since
the root issue is the same: you can't do this from non-secure mode.
R.
But there also doesn't look to be a clean way to implement this: It
appears that this is done on an operating system basis, and when running
bare metal it is not clear where the code would live.
There is no "clean" way to handle the appropriate cache invalidation,
because there is no clean way to get the addresses you need for
invalidating the instruction cache. (Cleanly invalidating the
instruction cache for other purposes, such as during firmware upgrades,
is no problem.)
There are also at least two approaches to solve it, I guess:
1. Somehow indicate on the command line (via target or a dedicated
option) to emit the clear cache call for cortex M, and I guess that the
function itself should do nothing if both caches are disabled.
2. Define hooks or provide a command line option so that developers can
provide an implementation for their platform?
Assuming I were to do this the improper way (and just create a build that
works only for my particular target): Where should I define
CLEAR_INSN_CACHE?
I am not sure if there is already a way to do all this that I am just
unaware of?
Seriously - don't use nested functions in C. Even if you get them
working, it would be painfully inefficient. You'd have to flush parts
of the data cache (to make sure the stack data is written out to main
memory), taking time. You'd then have to invalidate the relevant parts
of the instruction cache. (Even calculating what parts of these caches
to clear will take time and effort.) Then everything needs to be read
into the caches again to actually execute the function.
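To give an idea of the bookkeeping involved, here is a rough sketch of just the "which cache lines" calculation, assuming the 32-byte line size of the Cortex-M7 L1 caches. The names cache_range_t and cache_range_for are invented for this example.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 32u  /* Cortex-M7 L1 cache line size in bytes */

/* Hypothetical result type: a region widened to whole cache lines. */
typedef struct {
    uintptr_t start;  /* rounded down to a cache-line boundary */
    uint32_t  size;   /* multiple of CACHE_LINE_SIZE */
} cache_range_t;

/* Widen [addr, addr + len) so it covers only complete cache lines. */
static cache_range_t cache_range_for(uintptr_t addr, uint32_t len)
{
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
    cache_range_t r;
    uintptr_t end = (addr + len + CACHE_LINE_SIZE - 1u) & mask;

    r.start = addr & mask;
    r.size  = (uint32_t)(end - r.start);
    return r;
}
```

On the real device the resulting start and size would then go to whatever clean and invalidate routines are available, followed by the usual barriers. All of that runs every time a trampoline is set up.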
And what's the point? So that you can write :
    void foo(...) {
        int bar(...) {
            ...
        }
        bar();
    }

instead of

    static int foo_bar(...) {
        ...
    }
    void foo(...) {
        foo_bar();
    }

or (in C++)

    void foo(...) {
        auto bar = [](...) {
            ...
        };
        bar();
    }
?
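If the nested function needs the enclosing function's locals (the one case where a trampoline is actually generated, when its address is passed as a callback), the usual plain C replacement is a static function plus an explicit context pointer. A minimal sketch, with all names (sum_ctx, for_each, add_if_above, sum_above) made up for illustration:

```c
#include <stddef.h>

/* The locals a nested function would otherwise have captured. */
struct sum_ctx {
    int  threshold;
    long total;
};

typedef void (*visit_fn)(int value, void *ctx);

/* Generic iterator: the callback receives its context explicitly. */
static void for_each(const int *items, size_t n, visit_fn fn, void *ctx)
{
    for (size_t i = 0; i < n; i++)
        fn(items[i], ctx);
}

/* Ordinary static function standing in for the nested function. */
static void add_if_above(int value, void *ctx)
{
    struct sum_ctx *c = ctx;
    if (value > c->threshold)
        c->total += value;
}

long sum_above(const int *items, size_t n, int threshold)
{
    struct sum_ctx c = { threshold, 0 };
    for_each(items, n, add_if_above, &c);
    return c.total;
}
```

Nothing here is written to the stack as code, so no cache maintenance is needed and it works the same with caches on or off.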