Altering OpenMP emitted code

Hi!  I'm reaching the point of exhaustion in trying to understand the GCC source code, so I need help.  I want to change the code that GCC emits when the source code has an OpenMP reduction clause.


WHAT GCC DOES NOW

Suppose the source code is this minimal example:

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
        omp_set_num_threads(4);
        int x = 42;
#pragma omp parallel reduction(+:x)
        {
                x++;
        }       
        printf("x = %d\n", x);
        return EXIT_SUCCESS;
}       

GCC creates a separate (outlined) function, "main.omp_fn.0", for the body of the OpenMP parallel block.  Within main.omp_fn.0, to implement the reduction clause, GCC uses a temporary stack variable (let's call it x_prime), initialized to 0, in place of the original x.  Near the end of main.omp_fn.0, it adds the current value of x_prime to the original x with an atomic instruction, such as LOCK ADD on x86.  Here's the assembly code for x86_64/Ubuntu Linux, with labels and some dot-directives removed:

main.omp_fn.0:
        pushq   %rbp
        movq    %rsp, %rbp
        movq    %rdi, -24(%rbp)
        movl    $0, -4(%rbp)
        addl    $1, -4(%rbp)
        movq    -24(%rbp), %rax
        movl    -4(%rbp), %edx
        lock addl       %edx, (%rax)
        leave
        ret
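
For readers who prefer C to assembly, here is a rough hand-written equivalent of what the listing above does. It is only a sketch: the struct layout and the names omp_data and outlined_body are my own assumptions for illustration, not what GCC calls them internally, and __atomic_fetch_add is a GCC builtin standing in for the LOCK ADD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the block of shared data whose address
   arrives in %rdi; the real layout is decided by GCC. */
struct omp_data {
        int x;                  /* the original, shared x */
};

/* Hand-written approximation of main.omp_fn.0 (name is hypothetical). */
void outlined_body(struct omp_data *data)
{
        assert(data != NULL);

        int x_prime = 0;        /* movl $0, -4(%rbp): private copy starts at 0 */
        x_prime++;              /* addl $1, -4(%rbp): body of the parallel block */

        /* lock addl %edx, (%rax): merge the private copy into the shared x */
        __atomic_fetch_add(&data->x, x_prime, __ATOMIC_SEQ_CST);
}
```

Each thread that runs this function adds its own x_prime to the shared x exactly once, so with 4 threads and x starting at 42 the program prints x = 46.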


HOW I WOULD LIKE TO CHANGE GCC'S BEHAVIOR

I want to replace the LOCK ADD instruction with a call to my own function (let's say "omp_reduction").   I will need to pass to omp_reduction the following parameters:
-- An enumerator value dependent on the operator originally used in the reduction--here, say, "OP_PLUS" for the original + operator.
-- The address of (original) x
-- The address of x_prime
-- An enumerator value for the type of x and x_prime

So the signature of omp_reduction would be

void omp_reduction(enum op_type op, void * var, void * tmp, enum operand_type type);

If x were, say, a 32-bit integer, the call written in C would look like this:

omp_reduction(OP_PLUS, &x, &x_prime, INT32);
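
To make the intent concrete, here is a minimal sketch of what such a function might look like. Everything beyond the signature above is an assumption for illustration: OP_MULT, the helper combine_i32, and the compare-and-swap loop built on GCC's __atomic builtins are placeholders, and only the INT32 case is handled. My real function would do something different in place of the atomic update.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum op_type { OP_PLUS, OP_MULT };      /* OP_MULT added for illustration */
enum operand_type { INT32 };            /* other types would be added here */

/* Hypothetical helper: apply the reduction operator to two values. */
static int32_t combine_i32(enum op_type op, int32_t a, int32_t b)
{
        switch (op) {
        case OP_PLUS: return a + b;
        case OP_MULT: return a * b;
        }
        return a;
}

/* var: address of the original x; tmp: address of the private x_prime. */
void omp_reduction(enum op_type op, void *var, void *tmp,
                   enum operand_type type)
{
        assert(var != NULL && tmp != NULL);

        switch (type) {
        case INT32: {
                int32_t *shared = var;
                int32_t partial = *(int32_t *)tmp;
                int32_t old = __atomic_load_n(shared, __ATOMIC_RELAXED);
                int32_t merged;

                /* Retry until no other thread changed *shared in between;
                   on failure, old is refreshed with the current value. */
                do {
                        merged = combine_i32(op, old, partial);
                } while (!__atomic_compare_exchange_n(shared, &old, merged, 0,
                                                      __ATOMIC_SEQ_CST,
                                                      __ATOMIC_RELAXED));
                break;
        }
        default:
                assert(!"unsupported operand type");
        }
}
```

With this sketch, the call omp_reduction(OP_PLUS, &x, &x_prime, INT32) with x == 42 and x_prime == 1 leaves x == 43, which matches what the LOCK ADD does today for a single thread.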


WHERE I AM NOW (LOST)

I think the atomic instruction at the end (e.g., LOCK ADD) is represented by the gimple_reduction_merge field of type gimple_seq in the tree_omp_clause structure defined in tree.h:

struct GTY(()) tree_omp_clause {
  /* (.. Other fields ...) */
  /* The gimplification of OMP_CLAUSE_REDUCTION_{INIT,MERGE} for omp-low's
     usage.  */
  gimple_seq gimple_reduction_init;
  gimple_seq gimple_reduction_merge;

  tree GTY ((length ("omp_clause_num_ops[OMP_CLAUSE_CODE ((tree)&%h)]"))) ops[1];
};

But I do not understand how GCC assigns or uses this field, or how I can alter GCC's behavior with respect to it.  I cannot seem to find the relevant source code in the gcc/gcc directory.

I'd really appreciate help or guidance.  Thanks!

Amittai Aviram
PhD Student in Computer Science
Yale University
646 483 2639
amittai.aviram@xxxxxxxx
http://www.amittai.com



