Re: arm-none-eabi, nested function trampolines and caching

On 27/11/2023 17:23, David Brown wrote:
On 27/11/2023 16:16, Ed Robbins via Gcc-help wrote:
Hello,
I am using gcc-arm-none-eabi with a cortex M7 device, with caches
(data/instruction) enabled. Nested function calls result in a usage fault
because there is no clear cache call for this platform.


I am not sure I understand you here.  Are you talking about trying to use gcc nested function extensions, implemented by trampolines (small function stubs on the stack)?  If so, then the simple answer is - don't do that.  It's a /really/ bad idea.  As far as I understand it, these are a left-over from the way nested functions were originally implemented in other gcc languages (Pascal, Ada, Modula-2), which now handle things differently and far more efficiently.  Trampolines were a convenient way to implement nested functions some 30 years ago, before caches were the norm, before anyone thought about security, before processors had prefetching, and before people realised what an appallingly bad idea self-modifying code is.

If you want to use nested functions, use a language that supports nested functions, such as Ada, or use C++ with lambdas (which are a bit like nested functions only much better).

Is there a way to provide the required functions without rebuilding gcc? I
have been looking at the source and, as far as I can tell, there is not.

I can think of at least four ways :

1. The SDK for your microcontroller, provided by the manufacturer, will have headers with cache clear functions.

2. The ARM CMSIS headers - also available from your manufacturer - have intrinsic functions, including cache clear functions.

3. gcc has a generic "__builtin___clear_cache" function :
<https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005f_005f_005fclear_005fcache>

4. gcc supports the "ARM C Language Extensions", which include cache control intrinsics:

<https://gcc.gnu.org/onlinedocs/gcc/ARM-C-Language-Extensions-_0028ACLE_0029.html>
<https://developer.arm.com/documentation/ihi0053/latest/>
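For reference, a minimal sketch of how the builtin from option 3 is called. The wrapper name sync_code_range is hypothetical; only __builtin___clear_cache itself comes from gcc. Whether the call has any effect depends on the target: a later reply in this thread reports that it is only implemented for Linux, so on a bare-metal Cortex-M7 it may well do nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: ask the compiler to synchronise the caches for
 * the byte range [start, start + len).  Returns the end pointer that
 * was passed to the builtin.  On targets without an implementation the
 * builtin may expand to nothing at all. */
static char *sync_code_range(void *start, size_t len)
{
    char *begin = start;
    char *end = begin + len;

    __builtin___clear_cache(begin, end);
    return end;
}
```

The builtin takes a begin pointer and an end pointer (one past the last byte), not a pointer and a length, which is why the wrapper converts between the two.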


I completely agree with David's comments about nested functions. Don't do it!

Cleaning the D-caches from user-level on Arm is practically impossible if there is no "OS" support; flushing the I-cache is equally difficult. This includes m-profile devices with secure and non-secure code, where only secure code can execute the cache management operations. The same is true for some, if not all, a-profile devices as well.

Looking at the compiler sources the __clear_cache builtin is only implemented for Linux and even there it calls the kernel to do the work.

ACLE does not define a clear cache intrinsic operation (as far as I can see). It does provide some of the primitives needed for a cache clear, such as __dmb() and __isb(), but on their own, these are not enough.

CMSIS does appear to provide some primitives (SCB_CleanDCache_by_Addr and SCB_InvalidateICache_by_Addr), but these will directly invoke the relevant secure-mode primitives. If you want them in non-secure mode, you'll need to export a suitable API from your secure code and then arrange to use that. The compiler knows nothing about CMSIS, so this isn't much help for trampolines, I'm afraid.

Microcontroller SDKs are almost certain to face similar issues, since the root issue is the same: you can't do this from non-secure mode.

R.



But there also doesn't look to be a clean way to implement this: It appears that this is done on an operating system basis, and when running bare metal
it is not clear where the code would live.

There is no "clean" way to handle the appropriate cache invalidation, because there is no clean way to get the addresses you need for invalidating the instruction cache.  (Cleanly invalidating the instruction cache for other purposes, such as during firmware upgrades, is no problem.)


There are also at least two approaches to solve it, I guess:
1. Somehow indicate on the command line (via target or a dedicated option)
to emit the clear cache call for cortex M, and I guess that the function
itself should do nothing if both caches are disabled.
2. Define hooks or provide a command line option so that developers can
provide an implementation for their platform?

Assuming I were to do this the improper way (and just create a build that
works only for my particular target): Where should I define
CLEAR_INSN_CACHE?

I am not sure if there is already a way to do all this that I am just
unaware of?


Seriously - don't use nested functions in C.  Even if you get them working, it would be painfully inefficient.  You'd have to flush parts of the data cache (to make sure the stack data is written out to main memory), taking time.  You'd then have to invalidate the relevant parts of the instruction cache.  (Even calculating what parts of these caches to clear will take time and effort.)  Then everything needs to be read into the caches again to actually execute the function.
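To give an idea of the "calculating what parts to clear" step, here is a rough sketch of the address arithmetic only. It assumes 32-byte cache lines (the Cortex-M7 L1 line size); the struct and function names are made up for illustration, and the actual clean/invalidate operations are left out because they are target-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte cache lines, as on the Cortex-M7 L1 caches. */
#define CACHE_LINE_SIZE 32u

/* Hypothetical helper type: the line-aligned span covering a region. */
struct cache_span {
    uintptr_t start;  /* region start rounded down to a line boundary */
    size_t    lines;  /* number of cache lines that need maintenance */
};

/* Work out which cache lines overlap the bytes [addr, addr + len). */
static struct cache_span cache_span_for(uintptr_t addr, size_t len)
{
    struct cache_span span = { 0, 0 };
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1u);

    if (len == 0)
        return span;  /* nothing to do for an empty region */

    uintptr_t first = addr & mask;              /* line holding the first byte */
    uintptr_t last  = (addr + len - 1u) & mask; /* line holding the last byte */

    span.start = first;
    span.lines = (size_t)((last - first) / CACHE_LINE_SIZE) + 1u;
    return span;
}
```

A trampoline that is not line-aligned can straddle one more line than its size suggests: a 40-byte region starting at 0x20000010 touches two lines, not one and a bit, and each line has to be cleaned from the data cache and invalidated in the instruction cache.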

And what's the point?  So that you can write :

     void foo(...) {
         int bar(...) {
             ...
         }
         bar();
     }

instead of

     static int foo_bar(...) {
         ...
     }
     void foo(...) {
         foo_bar();
     }

or

     void foo(...) {
         auto bar = [](...) {
             ...
         };
         bar();
     }

?
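The static-function version also covers the case where gcc actually needs a trampoline, namely when the nested function is passed as a callback and reads its parent's locals. A minimal sketch, with all names invented for illustration: the "captured" variables go into a struct, and the callback takes a pointer to it explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context: the locals a nested function would have captured. */
struct sum_ctx {
    int threshold;
    int total;
};

/* Callback type with an explicit context pointer instead of a closure. */
typedef void (*visit_fn)(int value, void *ctx);

/* Call visit() once for each element of the array. */
static void for_each(const int *values, size_t count, visit_fn visit, void *ctx)
{
    for (size_t i = 0; i < count; i++)
        visit(values[i], ctx);
}

/* What would have been the nested function: an ordinary static function. */
static void add_if_above(int value, void *ctx)
{
    struct sum_ctx *c = ctx;

    if (value > c->threshold)
        c->total += value;
}

/* Sum the elements that are strictly greater than threshold. */
static int sum_above(const int *values, size_t count, int threshold)
{
    struct sum_ctx ctx = { threshold, 0 };

    for_each(values, count, add_if_above, &ctx);
    return ctx.total;
}
```

Nothing is written to the stack as code here, so no cache maintenance is involved, and the callback is a plain function pointer that works with any compiler.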




