Re: OpenMP 4.0


 



On 08/29/2013 09:40 PM, Tim Prince wrote:
On 8/29/2013 2:20 PM, Tobias Burnus wrote:
José Luis García Pallero wrote:
I don't know if this is the correct place for this question, but I
haven't found any mailing list on the GOMP webpage.
The OpenMP 4.0 specification was released ten days ago. This new
standard includes several interesting features, such as SIMD and
accelerator directives and error-handling facilities. Is it planned to
add this new version of OpenMP to libgomp and, then, to GCC 4.9?

Well, it takes a while until features are implemented - and the implementation work can only start after a specification/standard is sufficiently finished to not change in a major way.

Having said that, there is a GCC branch called gomp-4_0-branch (see
http://gcc.gnu.org/svn.html), which is used for the on-going implementation. I think SIMD already partially works (with C/C++, not yet with Fortran).

I believe that it is planned to support OpenMP 4 in GCC 4.9.


Tim Prince wrote:
OpenMP 4.0 simd facilities are related to Cilk(tm) Plus pragmas, for which there is a gcc branch on git (although I haven't figured out that stuff).

As far as I gathered, Cilk+'s pragmas and OpenMP's pragmas are supposed to be handled identically. (I think there were some differences but they got resolved by changing Cilk+.) There are some Cilk+ branches, which aim at consolidating the effort with OpenMP. Actually, some Cilk+ patches have been submitted for inclusion - thus, expect more on this front. (The submitted patches do not include SIMD as far as I know. The branches do support it.)

There are distinctions in Intel compilers between Cilk(tm) Plus and OpenMP 4.0. For example, Cilk(tm) Plus expects use of simd firstprivate lastprivate where appropriate, while OpenMP 4.0 doesn't support those clauses, and depends on the compiler recognizing those cases of omp simd private. Intel once talked of reconciling terminology (it seems unsatisfactory to market Fortran directives as Cilk(tm) Plus). Intel takes Cilk(tm) Plus simd to require in-line simd instructions rather than automatic replacement by the special memset/memcpy library function calls, while the corresponding omp simd construct doesn't inhibit those automatic replacements. I guess gfortran et al. aren't so likely to introduce these substitutions, so don't need a means to control them.


For example, I know of no one planning to implement user defined reduction. Some talk about proposing a specific standard on indexed min/max before deciding about user defined reductions.

I think the gomp-4_0-branch has already supported min/max for quite some time. (For C/C++; Fortran has supported it since older OpenMP specs.) Additionally, I believe that Jakub intends to implement user-defined reductions (UDR) and that he has already done some prep work on the branch. Ignoring "omp target", UDR seems to be the biggest new feature.
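
For readers who have not seen the OpenMP 4.0 syntax, a user-defined reduction for the "indexed max" case mentioned above might look roughly like the sketch below. The type `indexed_max_t`, the helper `indexed_max_combine`, the function `find_indexed_max`, and the reduction name `imax` are all made up for illustration; only `declare reduction`, `omp_in`, `omp_out`, `omp_priv` and `omp_orig` come from the specification. It needs a compiler with UDR support; without OpenMP enabled the pragmas are ignored and the loop runs serially with the same result.

```c
#include <stddef.h>

/* Hypothetical helper type: a value plus the index where it occurs. */
typedef struct {
    double value;
    size_t index;
} indexed_max_t;

/* Combiner: keep the larger value; on ties keep the smaller index, so the
   result does not depend on how iterations are divided among threads. */
static indexed_max_t indexed_max_combine(indexed_max_t a, indexed_max_t b)
{
    if (b.value > a.value || (b.value == a.value && b.index < a.index))
        return b;
    return a;
}

/* OpenMP 4.0 user-defined reduction; the name "imax" is an assumption.
   Each private copy starts out as a copy of the original variable. */
#pragma omp declare reduction(imax : indexed_max_t : \
        omp_out = indexed_max_combine(omp_out, omp_in)) \
        initializer(omp_priv = omp_orig)

/* Returns the maximum of x[0..n-1] and its first index; n must be > 0. */
indexed_max_t find_indexed_max(const double *x, size_t n)
{
    indexed_max_t best = { x[0], 0 };

    #pragma omp parallel for reduction(imax : best)
    for (size_t i = 1; i < n; ++i) {
        indexed_max_t candidate = { x[i], i };
        best = indexed_max_combine(best, candidate);
    }
    return best;
}
```

Calling `find_indexed_max(x, n)` on a non-empty array returns both the largest element and the position of its first occurrence.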
C omp parallel reduction(min|max: ) was introduced in OpenMP 3.1 but I didn't find any tests for it in the gcc 4.9 testsuite. Corresponding omp simd reduction would not be so important for C++ if g++ could optimize min/max with maxp[sd]/minp[sd] as gfortran and icpc do. No omp max|min reductions are likely in the Intel icc/icpc 14.0 releases in a week or so, regardless of claims to support OpenMP 4.0.
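
For reference, the OpenMP 3.1 form of a C max reduction discussed here can be sketched as follows. The function name `array_max` is made up for illustration, and whether a given compiler release accepts the clause is exactly the open question in this thread. Built without OpenMP enabled, the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Returns the largest element of x[0..n-1]; n must be > 0.
   Each thread keeps a private running maximum, and the private
   values are combined with the original m when the loop ends. */
double array_max(const double *x, size_t n)
{
    double m = x[0];

    #pragma omp parallel for reduction(max : m)
    for (size_t i = 1; i < n; ++i) {
        if (x[i] > m)
            m = x[i];
    }
    return m;
}
```

Replacing `max` with `min` and flipping the comparison gives the corresponding minimum reduction.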

(Regarding "omp target" and other accelerator/GPU/hybrid-system support: I think there is quite some interest to get it working with GCC, however, it probably will take until 4.10 or longer.)


Among my ulterior motives for asking is my attempt to write a book centered on HPC development topics.

That sounds interesting!


Tobias

PS: Regarding SIMD, in GCC 4.9 itself, some basic support was merged a few days ago. However, it is not yet accessible from user code (no front-end support) and I have the impression the information is not yet used for optimization. But expect some support soon (possibly something like #pragma simd, #pragma vector for C/C++ and usage for DO CONCURRENT in Fortran) - but I don't know which pragma will be supported or when the support will be added.
DO CONCURRENT needs a more satisfactory way to invoke omp parallel. A limited facility (beyond current auto-parallelization) would not appear in ifort until next year. It seems too difficult to cover all possibilities. Auto-vectorization works well already (in gfortran, for example).


I grabbed the gomp-4_0-branch, had to set --disable-werror to build it. I found 2 cases in netlib vectors benchmark where #pragma omp simd brings gcc performance up to at least match icc. I suppose it could do the same for gfortran when the omp simd directives become available.
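
As a rough illustration of what `#pragma omp simd` looks like in C, here is a minimal sketch. These are not the netlib vectors loops mentioned above; the functions `axpy` and `dot` are made-up examples, and the sketch assumes a compiler that accepts the OpenMP 4.0 simd construct (when OpenMP is not enabled, the pragmas are ignored and the loops are left to the auto-vectorizer).

```c
#include <stddef.h>

/* y[i] += a * x[i]. The pragma tells the compiler that the iterations
   may be executed concurrently in SIMD lanes. */
void axpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Dot product. The reduction clause lets each SIMD lane accumulate a
   partial sum, which may change the order of the floating-point adds. */
float dot(size_t n, const float *x, const float *y)
{
    float sum = 0.0f;

    #pragma omp simd reduction(+ : sum)
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

The reduction form is the interesting one: without the clause, a compiler that honors strict floating-point ordering may decline to vectorize the sum.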
Tim

--
Tim Prince




