CSE and DCE of not-quite-gnu::const functions

Hi,

Short version:

How do I tell GCC that a C expression (containing an external function
call) can safely be subjected to CSE and DCE, even across external
function calls, *but* that reads from global memory must not be hoisted
above its first occurrence? Just putting the expression in an inline
function marked with [[gnu::const]] / __attribute__((const)) seems to
do the trick on current GCC versions, but the compiler doesn’t appear
to actually guarantee the no-hoisting part.

Long version:

I’m writing a C library (an OpenGL loader[1]) whose task is, basically,
to provide a way to check, at runtime, if a feature of some kind is
available on the system, and if so, make some functions implemented by
that feature available to the program.

In a mockup (full source code with usage example below), I have the
macro `have_feature`, expanding to a boolean expression, and macros
`func1` and `func2`, expanding to function expressions that can be
called provided that `have_feature` has evaluated to a true value.

The implementation is that `func1` and `func2` refer to function
pointer variables `pfunc1` and `pfunc2`, and `have_feature` checks if
the feature has been loaded, loads it (by calling an external function
which sets `pfunc1` and `pfunc2` when successful) if it hasn’t, then
returns a presence flag.

This means that `have_feature` has a curious set of properties:

   1. It *can* be duplicated as necessary;
   2. It *can* be discarded if its value is unused;
   3. It *can* be evaluated only once if it occurs multiple times
      inside a function, even if there is an external function call or
      some other compiler barrier between the two occurrences;
   4. Indeed, it always evaluates to the same value during the lifetime
      of the program;
   5. But it *must* remain above all loads from `pfunc1` and `pfunc2`
      that were conditional on its value in the program text.

As far as I can tell, this behaviour exactly corresponds to a function
marked with [[gnu::const]] / __attribute__((const)) with no arguments,
except for property 5. Indeed, in all examples so far, putting the
necessary expression into such an (inline) function seems to work.
Still, GCC loves moving code and I’d rather not depend on it not doing
so here, so... what’s the right way to express this?
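
To make the concern behind property 5 concrete, the following sketch
writes out by hand the reordering in question. It is an illustration
under assumptions, not something GCC has been observed to produce: the
checker is deliberately left without the const attribute so the
compiler keeps the order as written, everything lives in one file, and
the two `..._is_valid` helpers are hypothetical names introduced only
for this example.

```c
#include <stdbool.h>
#include <stddef.h>

enum { LOADED = 1, PRESENT = 2 };

static unsigned feature_state;      /* 0: feature not loaded yet */
static void (*pfunc1)(void);        /* null until load_feature() runs */

static void func1impl(void) {}

static void load_feature(void) {
    pfunc1 = &func1impl;
    feature_state = LOADED | PRESENT;
}

/* Deliberately NOT marked const: the reordering is done by hand below,
   so the compiler must not add any of its own. */
static bool get_have_feature(void) {
    if (!(feature_state & LOADED))
        load_feature();
    return (feature_state & PRESENT) != 0;
}

/* Program order: the check runs first, then the load from pfunc1. */
static bool load_after_check_is_valid(void) {
    bool have = get_have_feature();
    void (*f)(void) = pfunc1;
    return have && f != NULL;
}

/* The feared order: the load from pfunc1 has moved above the check,
   so the very first call sees the pointer before it has been set. */
static bool load_before_check_is_valid(void) {
    void (*f)(void) = pfunc1;
    bool have = get_have_feature();
    return have && f != NULL;
}
```

On a fresh program state, the hoisted variant reports an invalid
pointer on its first call even though the feature is present; once the
feature has been loaded, both variants agree.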

My current attempt is to use an explicit compiler barrier: replace

#define have_feature (get_have_feature())

(with get_have_feature being a [[gnu::const]] function) by

#define have_feature (get_have_feature() && \
(__atomic_signal_fence(__ATOMIC_ACQUIRE), true))

But I have no idea if this is correct (and no way to check, since I’m
unable to get GCC to perform the offending code movement in any case).

Full mockup code (compiles itself when run as a shell script):

```
#if 0
gcc -std=c2x -DMAIN -g -O2 -c -o "${0%.c}.main.o" "$0" &&
gcc -std=c2x -DLOAD -g -O2 -c -o "${0%.c}.load.o" "$0" &&
gcc -g -O2 -o "${0%.c}" "${0%.c}.main.o" "${0%.c}.load.o" &&
objdump --disassemble=main "${0%.c}"; exit $?
#endif

#include <stdbool.h>
#include <stdio.h>

enum { LOADED = 1, PRESENT = 2 };
extern unsigned feature_state;
extern void load_feature(void);
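/* Leading position, so that the attribute applies to the function
   rather than to the return type. */
[[gnu::const]] static inline bool get_have_feature(void) {
	if (!__builtin_expect(!!(feature_state & LOADED), true))
		load_feature();
	return !!(feature_state & PRESENT);
}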
#define have_feature (get_have_feature())

extern void (*pfunc1)(void), (*pfunc2)(void);
#define func1 (*pfunc1)
#define func2 (*pfunc2)

#ifdef MAIN

int main(void) {
	if (have_feature)
		func1();
	puts("middle");
	if (have_feature)
		func2();
	return 0;
}

#endif

#ifdef LOAD

static void func1impl(void) { puts("start"); }
static void func2impl(void) { puts("end"); }

unsigned feature_state /* = 0 */;
void (*pfunc1)(void), (*pfunc2)(void);

void load_feature(void) {
	/* imagine some dlopen() / dlsym() business here ... */
	pfunc1 = &func1impl; pfunc2 = &func2impl;
	feature_state = LOADED | PRESENT;
}

#endif
```

Notes on solving the problem in a different way:

 * Can I redefine `func1` and `func2` to do something else?  Yes.
 * Can I redefine `func1` and `func2` to load the function pointers by
   themselves?  Yes in theory, but I’d rather not (performance).
 * Can I redefine `func1` and `func2` to include the feature check,
   something like `(have_feature ? *func1 : __builtin_unreachable())`?
   No, because a single function can be provided by several independent
   features (e.g. glDebugMessageControl is provided both by
   GL_KHR_debug and by GL_VERSION_4_3); I just didn’t include this in
   the mockup for brevity.
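
The last point can be illustrated with a small sketch. Everything in
it is hypothetical: `debug_control` stands in for
glDebugMessageControl with a made-up signature, and the two loaders
take a flag that simulates whether the feature is available instead of
querying a real GL context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; NOT the real glDebugMessageControl signature. */
typedef void (*debug_control_fn)(int enable);

static int debug_enabled;
static void debug_control_impl(int enable) { debug_enabled = enable; }

/* One function pointer, shared by two independent features. */
static debug_control_fn p_debug_control;
static bool khr_debug_present, gl43_present;

/* Each loader fills in the shared pointer when its feature is available.
   The `available` flag simulates the outcome of a real lookup. */
static void load_khr_debug(bool available) {
    khr_debug_present = available;
    if (available)
        p_debug_control = debug_control_impl;
}

static void load_gl43(bool available) {
    gl43_present = available;
    if (available)
        p_debug_control = debug_control_impl;
}

#define have_khr_debug (khr_debug_present)
#define have_gl43      (gl43_present)
#define debug_control  (*p_debug_control)

/* The caller may guard the call with either feature, so the function
   macro itself cannot hard-code a single feature check. */
static bool enable_debug_if_possible(void) {
    if (have_khr_debug || have_gl43) {
        debug_control(1);
        return true;
    }
    return false;
}
```

Here the call is valid as soon as either check succeeds, which is why
the check has to stay separate from `debug_control` itself.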

[1]: https://www.khronos.org/opengl/wiki/OpenGL_Loading_Library

-- 
Thanks,
Alex
