Follow these source coding guidelines to improve run-time performance. The amount of improvement depends on how many times a statement is executed. For example, improving an arithmetic expression that is executed many times within a loop has the potential to improve performance more than improving a similar expression that is executed once outside a loop.

Avoid using integer or logical data smaller than 32 bits. Accessing a 16-bit (or 8-bit) data type can make data access less efficient, especially on Itanium-based systems.

To minimize data storage and memory cache misses with arrays, use 32-bit data rather than 64-bit data, unless you require the greater numeric range of 8-byte integers or the greater range and precision of double precision floating-point numbers.
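
As a rough illustration of the storage difference, the following sketch compares a REAL (KIND=4) array with a REAL (KIND=8) array of the same length. The program name, the array names, and the length N are hypothetical, and the STORAGE_SIZE intrinsic assumes a compiler that supports Fortran 2008.

```fortran
PROGRAM ARRAY_STORAGE
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 1000000      ! hypothetical array length
  REAL (KIND=4), ALLOCATABLE :: A4(:)
  REAL (KIND=8), ALLOCATABLE :: A8(:)

  ALLOCATE (A4(N), A8(N))
  A4 = 0.0
  A8 = 0.0D0

  ! STORAGE_SIZE returns the size of one array element in bits
  PRINT *, 'REAL (KIND=4) array bytes:', (STORAGE_SIZE(A4) / 8) * N
  PRINT *, 'REAL (KIND=8) array bytes:', (STORAGE_SIZE(A8) / 8) * N
END PROGRAM ARRAY_STORAGE
```

The smaller array occupies fewer cache lines, which is where the reduction in cache misses would be expected to come from.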

Avoid mixing integer and floating-point (REAL) data in the same computation. Expressing all numbers in a floating-point arithmetic expression (assignment statement) as floating-point values eliminates the need to convert data between fixed-point and floating-point formats. Expressing all numbers in an integer arithmetic expression as integer values has the same effect. Eliminating these conversions improves run-time performance.

For example, assuming that I and J are both INTEGER variables, expressing a constant number (2.) as an integer value (2) eliminates the need to convert the data:

Inefficient Code:

INTEGER I, J

I = J / 2.

Efficient Code:

INTEGER I, J

I = J / 2

You can use different sizes of the same general data type in an expression with minimal or no effect on run-time performance. For example, using REAL, DOUBLE PRECISION, and COMPLEX floating-point numbers in the same floating-point arithmetic expression has minimal or no effect on run-time performance.

In cases where more than one data type can be used for a variable, consider selecting the data types based on the following hierarchy, listed from most to least efficient:

Integer (also see above example)

Single-precision real, expressed explicitly as REAL, REAL (KIND=4), or REAL*4

Double-precision real, expressed explicitly as DOUBLE PRECISION, REAL (KIND=8), or REAL*8

Extended-precision real, expressed explicitly as REAL (KIND=16) or REAL*16

However, keep in mind that you should avoid mixing integer and floating-point (REAL) data in an arithmetic expression (see the example in the previous subsection).

Before you modify source code to avoid slow arithmetic operators, be aware that optimizations convert many slow arithmetic operators to faster arithmetic operators. For example, the compiler optimizes the expression H=J**2 to be H=J*J.

Consider also whether replacing a slow arithmetic operator with a faster arithmetic operator will change the accuracy of the results or impact the maintainability (readability) of the source code.

Replacing slow arithmetic operators with faster ones should be reserved for critical code areas. The following hierarchy lists the Intel Fortran arithmetic operators, from fastest to slowest:

Addition (+), subtraction (-), and floating-point multiplication (*)

Integer multiplication (*)

Division (/)

Exponentiation (**)
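
To make the accuracy caveat concrete, here is a minimal sketch that replaces a division inside a loop with a multiplication by a reciprocal computed once outside the loop. All names (A, B, C, SCALE, RSCALE) and the array length are hypothetical. The two loops are not guaranteed to give bit-identical results, because the reciprocal is itself rounded before it is used.

```fortran
PROGRAM RECIPROCAL
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 5            ! hypothetical array length
  REAL, PARAMETER :: SCALE = 3.0         ! hypothetical divisor
  REAL :: A(N), B(N), C(N), RSCALE
  INTEGER :: I

  DO I = 1, N
    A(I) = REAL(I)
  END DO

  ! Slower operator: one division per iteration
  DO I = 1, N
    B(I) = A(I) / SCALE
  END DO

  ! Faster operator: one division outside the loop, then multiplication
  RSCALE = 1.0 / SCALE
  DO I = 1, N
    C(I) = A(I) * RSCALE
  END DO

  ! Any nonzero element marks a result that changed with the rewrite
  PRINT *, B - C
END PROGRAM RECIPROCAL
```

Whether such a rewrite is worthwhile depends on how critical the loop is and on whether small differences in the results are acceptable.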

Avoid using EQUIVALENCE statements. EQUIVALENCE statements can:

Force unaligned data or cause data to span natural boundaries.

Prevent certain optimizations, including:

Global data analysis under certain conditions (see -O2 in Setting Optimization with -On options).

Implied-DO loop collapsing when the control variable is contained in an EQUIVALENCE statement.
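
The alignment problem can be sketched as follows. The names IW, D1, and D2 are hypothetical, and the sketch assumes 4-byte default integers and 8-byte DOUBLE PRECISION values.

```fortran
PROGRAM EQUIV_ALIGN
  IMPLICIT NONE
  INTEGER :: IW(3)
  DOUBLE PRECISION :: D1, D2

  ! D1 starts at IW(1) and D2 starts at IW(2). Under the size
  ! assumptions above, the two 8-byte values start 4 bytes apart,
  ! so they cannot both begin on an 8-byte boundary.
  EQUIVALENCE (IW(1), D1), (IW(2), D2)

  D1 = 1.0D0
  D2 = 2.0D0      ! overlaps the second half of D1, so D1 is no longer valid
  PRINT *, D2
END PROGRAM EQUIV_ALIGN
```

Whichever of D1 or D2 the compiler places off its natural boundary is then accessed as unaligned data.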

Whenever the Intel Fortran compiler has access to the use and definition of a subprogram during compilation, it may choose to inline the subprogram. Using statement functions and internal subprograms maximizes the number of subprogram references that will be inlined, especially when multiple source files are compiled together at optimization level -O3.

For more information, see Efficient Compilation.
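
The sketch below shows both forms in a single program unit, so that each definition is visible wherever it is referenced. The names AREA, CUBE, X, R, and V are hypothetical, and whether the compiler actually inlines either reference depends on the optimization level and on its own heuristics.

```fortran
PROGRAM INLINE_CANDIDATES
  IMPLICIT NONE
  REAL :: X, AREA, R

  ! Statement function: defined in the same scoping unit that uses it
  AREA(R) = 3.14159 * R * R

  X = 2.0
  PRINT *, AREA(X)
  PRINT *, CUBE(X)

CONTAINS

  ! Internal subprogram: always compiled together with its host
  REAL FUNCTION CUBE(V)
    REAL, INTENT(IN) :: V
    CUBE = V * V * V
  END FUNCTION CUBE

END PROGRAM INLINE_CANDIDATES
```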

Minimize the arithmetic operations and other operations in a DO loop whenever possible. Moving unnecessary operations outside the loop will improve performance (for example, when the intermediate nonvarying values within the loop are not needed).
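
As a minimal sketch of this guideline, the second loop below computes a loop-invariant value once instead of on every iteration. The names A, B, P, Q, and FACTOR and the array length are hypothetical. An optimizing compiler may perform this transformation on its own, so the manual rewrite matters most where the compiler cannot prove that the value is invariant.

```fortran
PROGRAM HOIST
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 1000         ! hypothetical array length
  REAL :: A(N), B(N), P, Q, FACTOR
  INTEGER :: I

  P = 2.0
  Q = 5.0
  B = 1.0

  ! Before: SQRT(P) * Q does not depend on I but sits inside the loop
  DO I = 1, N
    A(I) = B(I) * (SQRT(P) * Q)
  END DO

  ! After: the invariant value is computed once, outside the loop
  FACTOR = SQRT(P) * Q
  DO I = 1, N
    A(I) = B(I) * FACTOR
  END DO

  PRINT *, A(1)
END PROGRAM HOIST
```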

For more information on loop optimizations, see Pipelining for Itanium®-based Applications and Loop Unrolling; on the syntax of Intel Fortran statements, see the *Intel® Fortran Language Reference*.