The Intel® Fortran Compiler supports the OpenMP* Fortran version 2.0 API specification, except for the WORKSHARE directive. OpenMP provides symmetric multiprocessing (SMP) with the following major features:
Relieves the user from having to deal with the low-level details of iteration space partitioning, data sharing, and thread scheduling and synchronization.
Provides the performance benefits of shared-memory multiprocessor systems and, on IA-32 systems, of Hyper-Threading Technology-enabled systems (for Hyper-Threading Technology, refer to the IA-32 Intel® Architecture Optimization Reference Manual).
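As a minimal sketch of the first point, the loop below is parallelized with a single directive pair. The program name, the array-free sum-of-squares workload, and the problem size n are illustrative assumptions, not taken from the Intel documentation. The directive leaves the partitioning of iterations, the combination of per-thread partial sums, and the thread synchronization at the end of the loop to the compiler and run-time library.

```fortran
program sum_squares
  implicit none
  integer, parameter :: n = 1000   ! illustrative problem size (assumption)
  integer :: i
  double precision :: total

  total = 0.0d0

  ! PARALLEL DO asks the compiler to divide the iterations of the
  ! following loop among threads. REDUCTION(+:total) gives each thread
  ! a private partial sum and adds the partial sums into total when
  ! the loop ends. The loop index i is private to each thread.
!$OMP PARALLEL DO REDUCTION(+:total)
  do i = 1, n
     total = total + dble(i) * dble(i)
  end do
!$OMP END PARALLEL DO

  print *, 'sum of squares =', total
end program sum_squares
```

No iteration bounds per thread, locks, or thread-creation calls appear in the source; removing the two directive lines leaves an ordinary serial loop.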
The Intel Fortran Compiler performs transformations to generate multithreaded code based on the user's placement of OpenMP directives in the source program, making it easy to add threading to existing software. The Intel compiler supports all of the current industry-standard OpenMP directives except WORKSHARE, and compiles parallel programs annotated with OpenMP directives.
In addition, the Intel Fortran Compiler provides Intel-specific extensions to the OpenMP Fortran version 2.0 specification including run-time library routines and environment variables.
See parallelization options summary for all options of the OpenMP feature in the Intel Fortran Compiler. For complete information on the OpenMP standard, visit the www.openmp.org web site. For complete Fortran language specifications, see the OpenMP Fortran version 2.0 specifications.
To compile with OpenMP, prepare your program by annotating the code with OpenMP directives, which take the form of Fortran comments. The Intel Fortran Compiler first processes the application and produces a multithreaded version of the code, which is then compiled. The output is an executable in which the parallelism is implemented by threads that execute parallel regions or constructs. See Programming with OpenMP.
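The comment form of the directives can be illustrated with a short free-form example. This is a sketch only; the program name and the variable id are hypothetical. Lines beginning with the !$OMP sentinel are directives, and lines beginning with !$ followed by a space are conditionally compiled, so both are treated as ordinary comments when OpenMP processing is not enabled. The omp_lib module and the omp_get_thread_num routine are part of the OpenMP Fortran 2.0 specification.

```fortran
program hello_threads
  ! Conditionally compiled: the module is used only when OpenMP is enabled.
!$ use omp_lib
  implicit none
  integer :: id

  ! Value reported when the program is compiled without OpenMP.
  id = 0

  ! Each thread in the parallel region gets its own copy of id.
!$OMP PARALLEL PRIVATE(id)
  ! Conditionally compiled: ask the run-time library for this thread's number.
!$ id = omp_get_thread_num()
  print *, 'hello from thread', id
!$OMP END PARALLEL
end program hello_threads
```

Compiled without OpenMP, the same source builds as a serial program that prints a single line for thread 0, which is what makes it practical to add directives to existing code incrementally.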
For performance analysis of your program, you can use the VTune(TM) analyzer and/or the Intel® Threading Tools. These tools provide detailed information about which portions of the code take the most time to execute and where parallel performance problems are located.