Hello,
for builtin types, the compiler can optimize some functions when it
detects that an argument is a compile-time constant. For instance it can
turn a multiplication (or division) of an unsigned int by a power of 2
into a shift.
If I create my own type (I am doing C++, but it would be similar in C), I
lose all these optimizations.
Gcc provides __builtin_constant_p which in many cases (assuming at least
-O1) can detect compile-time constants. Knowing it, I would like to
perform some computation at compile-time that will make the evaluation
fast. For instance, if I detect a multiplication by a power of 2, I need
to compute the log2 of this number to pass it as an argument to shift. And
I really want the computation of the log2 to happen at compile-time,
otherwise this branch becomes slower than the regular multiplication. I
can compute the log2 and ask gcc whether the result is a compile-time
constant, so I am safe.
But in practice, the operations that carry constants are fairly limited.
In order to compile the log2, I have to use a switch statement with a list
of all possible powers of 2, because a for/while loop gives a result that
is not considered a constant (for log2 in particular, probably as a result
of the gmp-mpfr integration, (int)(log2(5)) is considered a constant by
the latest gcc version, but that is really a very special case). If I do
modular arithmetic and want to compute the inverse of a number to replace
a division by a multiplication, it essentially means computing a gcd and I
can't do it without a loop or recursion (or do I really have to unroll it
by hand to a large depth?).
Is there a way to tell gcc that it would be _really_ nice if it could
evaluate some function at compile-time, even if that means, as in the case
of a log2 computation, going through a loop up to 64 times?
Note that I could have a type static_int<n> and use C++ template
meta-programming to perform all the compile-time computations I want (log2
and gcd are trivial, sqrt is already a bit more fun), but it means I have
to write n/static_int<2>() instead of n/2 to halve a number, and I want to
avoid that. Besides, there are other overloading issues with this
technique.
Thank you for any hint,
--
Marc Glisse