Hi,

I would like to use coroutine syntax to program an AVR MCU. Being able to write synchronous-style code on an interrupt-driven architecture looks promising. However, I am running into the (well-known, I suppose) difficulties of coroutine memory management on embedded platforms.

There are two workarounds, both described in many articles on the internet. The first is to override `operator new` and place the coroutine state in a dedicated memory region. Unfortunately, the size of the coroutine state is not known at compile time, so a fixed-size region can eventually overflow and corrupt memory at run time. The second is to rely on the heap allocation elision optimization (HALO), where the compiler places the coroutine state in the caller's stack frame instead. However, it is not clear to me what rules the code must follow for the compiler to do that. My impression is that this optimization never actually kicks in for the `co_await` operator in gcc 11.

My question is this: I currently end up with a link-time error because `operator new` is missing. Is there a way (some compile options, perhaps?) to find out what prevents the compiler from placing the coroutine state in the `main()` frame?