-------- CONTEXT: --------

I have completely rewritten the gegl/buffer/gegl-sampler-yafr.c code. Before I put together the patch for the new yafr, I would like to see whether I can make the code even faster by using C99/gcc built-in math functions. I have not tried this yet.

The method used by the updated code differs from the first-generation yafr: it is at once softer and more pervasive. (Yes, I know this is vague. What I mean is that the nonlinear correction is "on" throughout more of the image, but its effect is never as extreme.) The code also runs faster than before: on my current-vintage laptop, yafr scales up about 10% slower than gegl-sampler-linear and about 10% faster than gegl-sampler-cubic.

Regarding further speed-up: using fabs, fmin and copysign, I could make my code branch-free (assuming, of course, that the compiler translates these calls into inline machine instructions on the machine on which the code is compiled). That is, the code, which right now contains no "if", no "for", no "do" and no "while", would then also contain no "?:". I suspect that using arithmetic in place of branching could make my code run noticeably faster.

--------- QUESTION: ---------

I noticed that fabs, fmin and copysign, or similar C99/gcc built-ins, are not found anywhere in the gegl source. Is there a preferred/tolerated way of using such math functions in gegl?

Can I assume that gfloats are floats? Can I assume that gdoubles are doubles? Must I allow for the possibility that gfloats are doubles? Must I allow for the possibility that gdoubles are floats? Could gfloats or gdoubles be anything other than floats or doubles?

Some ideas:

Idea 0: It may be that I can use the type-generic fabs, fmin and copysign on gfloats without a speed hit. Hopefully, gcc can pick the correct version based on the fact that it acts on gfloats. If not, it may be that using the double versions on gfloats is still faster than the alternatives.
Idea 1: If I KNEW for a fact that gfloat = float, I could simply use fabsf, fminf and copysignf.

Idea 2: I could do the necessary parts of the computation with doubles (or gdoubles) and then use the double versions. Hopefully, this would not slow gegl down on hardware that is faster with floats than with doubles (like some GPUs).

Idea 3: Is there a smarter way, one that picks the right version automatically?

Idea 4: Change the compilation flags to enable C99 built-ins?

Idea 5: Do you have another idea?

Idea 6: Or should I just stick to C90 gcc built-ins?

Nicolas Robidoux
Laurentian University/Universite Laurentienne