Hi,

Short version: How do I tell GCC that a C expression (containing an external function call) can be safely subjected to CSE and DCE, even across external function calls, *but* reads from global memory cannot be hoisted above it(s first occurrence)? Just putting the expression in an inline function marked with [[gnu::const]] / __attribute__((const)) seems to do the trick on current GCC versions, but it seems that the compiler doesn’t actually guarantee the no-hoisting part.

Long version: I’m writing a C library (an OpenGL loader[1]) whose task is, basically, to provide a way to check, at runtime, if a feature of some kind is available on the system, and if so, make some functions implemented by that feature available to the program.

In a mockup (full source code with usage example below), I have the macro `have_feature`, expanding to a boolean expression, and macros `func1` and `func2`, expanding to function expressions that can be called provided that `have_feature` has evaluated to a true value. The implementation is that `func1` and `func2` refer to function pointer variables `pfunc1` and `pfunc2`, and `have_feature` checks if the feature has been loaded, loads it (by calling an external function which sets `pfunc1` and `pfunc2` when successful) if it hasn’t, then returns a presence flag.

This means that `have_feature` has a curious set of properties:

1. It *can* be duplicated as necessary;
2. It *can* be discarded if its value is unused;
3. It *can* be evaluated only once if it occurs multiple times inside a function, even if there is an external function call or some other compiler barrier between the two occurrences;
4. Indeed, it always evaluates to the same value during the lifetime of the program;
5. But it *must* remain above all loads from `pfunc1` and `pfunc2` that were conditional on its value in the program text.

As far as I can tell, this set of properties corresponds exactly to a function marked with [[gnu::const]] / __attribute__((const)) with no arguments, except for property 5. Indeed, in all examples so far, putting the necessary expression into such an (inline) function seems to work. Still, GCC loves moving code and I’d rather not depend on it not doing it, so... what’s the right way to express this?

My current attempt is to use an explicit compiler barrier: replace

    #define have_feature (get_have_feature())

(with get_have_feature being a [[gnu::const]] function) by

    #define have_feature (get_have_feature() && \
        (__atomic_signal_fence(__ATOMIC_ACQUIRE), true))

But I have no idea if this is correct (and no way to check, since I’m unable to get GCC to perform the offending code movement in any case).

Full mockup code (compiles itself when run as a shell script):

```
#if 0
gcc -std=c2x -DMAIN -g -O2 -c -o "${0%.c}.main.o" "$0" &&
gcc -std=c2x -DLOAD -g -O2 -c -o "${0%.c}.load.o" "$0" &&
gcc -g -O2 -o "${0%.c}" "${0%.c}.main.o" "${0%.c}.load.o" &&
objdump --disassemble=main "${0%.c}"; exit $?
#endif

#include <stdbool.h>
#include <stdio.h>

enum { LOADED = 1, PRESENT = 2 };
extern unsigned feature_state;
extern void load_feature(void);

/* The attribute comes first so that it applies to the function
   rather than to the return type. */
[[gnu::const]] static inline bool get_have_feature(void)
{
    if (!__builtin_expect(!!(feature_state & LOADED), true))
        load_feature();
    return !!(feature_state & PRESENT);
}
#define have_feature (get_have_feature())

extern void (*pfunc1)(void), (*pfunc2)(void);
#define func1 (*pfunc1)
#define func2 (*pfunc2)

#ifdef MAIN
int main(void)
{
    if (have_feature)
        func1();
    puts("middle");
    if (have_feature)
        func2();
    return 0;
}
#endif

#ifdef LOAD
static void func1impl(void) { puts("start"); }
static void func2impl(void) { puts("end"); }

unsigned feature_state /* = 0 */;
void (*pfunc1)(void), (*pfunc2)(void);

void load_feature(void)
{
    /* imagine some dlopen() / dlsym() business here ... */
    pfunc1 = &func1impl;
    pfunc2 = &func2impl;
    feature_state = LOADED | PRESENT;
}
#endif
```

Notes on solving the problem in a different way:

* Can I redefine `func1` and `func2` to do something else? Yes.
* Can I redefine `func1` and `func2` to load the function pointers by themselves? Yes in theory, but I’d rather not (performance).
* Can I redefine `func1` and `func2` to include the feature check, something like `(have_feature ? *func1 : __builtin_unreachable())`? No. A single function can be provided by several independent features (e.g. glDebugMessageControl is provided both by GL_KHR_debug and by GL_VERSION_4_3); I just didn’t include this in the mockup for brevity.

[1]: https://www.khronos.org/opengl/wiki/OpenGL_Loading_Library

--
Thanks,
Alex