Intel Extension Routines

The Intel® Fortran Compiler implements the following groups of routines as extensions to the OpenMP run-time library: routines that get and set the stack size for parallel threads, and routines for memory allocation.

The Intel extension routines described in this section can be used for low-level debugging to verify that the library code and application are functioning as intended. Use these routines with caution, because using them requires the -openmp_stubs command-line option to execute the program sequentially. These routines are also generally not recognized by other vendors' OpenMP-compliant compilers, which may cause the link stage to fail with those compilers.

Stack Size

In most cases, environment variables can be used in place of the extension library routines. For example, the stack size of the parallel threads may be set using the KMP_STACKSIZE environment variable rather than the kmp_set_stacksize() library routine.

Note
A run-time call to an Intel extension routine takes precedence over the corresponding environment variable setting.

The routines kmp_set_stacksize() and kmp_get_stacksize() work with 32-bit values only: the former takes a 32-bit argument and the latter returns a 32-bit result. The routines kmp_set_stacksize_s() and kmp_get_stacksize_s() use a size_t value, which can hold 64-bit integers.

On Itanium-based systems, it is recommended to always use kmp_set_stacksize_s() and kmp_get_stacksize_s(). These _s() variants must be used if you need to set a stack size ≥ 2**32 bytes (4 gigabytes).

See the definitions of stack size routines in the table that follows.
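
The calling order described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes that the omp_lib module supplies kmp_size_t_kind and the interfaces for the kmp_* routines, and the 16 MB value and the program name are arbitrary choices made for the example.

```fortran
program stacksize_demo
  ! Assumption: omp_lib provides kmp_size_t_kind and the kmp_* interfaces.
  use omp_lib
  implicit none
  integer(kind=kmp_size_t_kind) :: requested, actual

  ! Illustrative request of 16 MB per thread, computed in the size_t kind.
  requested = 16_kmp_size_t_kind * 1024 * 1024

  ! The call must come before the first parallel region to have an effect.
  call kmp_set_stacksize_s(requested)

  ! Read the value back; the run-time library may adjust the request.
  actual = kmp_get_stacksize_s()
  print *, 'per-thread stack size in bytes:', actual

!$omp parallel
  ! Each thread in this region uses a private stack of the size set above.
  print *, 'hello from thread', omp_get_thread_num()
!$omp end parallel
end program stacksize_demo
```

Because a run-time call takes precedence over the environment variable, this program would use the requested size even if KMP_STACKSIZE were set to a different value.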

Memory Allocation

The Intel® Fortran Compiler implements a group of memory allocation routines as an extension to the OpenMP* run-time library to enable threads to allocate memory from a heap local to each thread. These routines are: kmp_malloc, kmp_calloc, and kmp_realloc.

Memory allocated by these routines must be freed with the kmp_free routine. It is legal for memory to be allocated by one thread and freed with kmp_free by a different thread, but this mode of operation carries a slight performance penalty.

See the definitions of these routines in the table that follows.
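
A hedged sketch of per-thread allocation is shown below. It assumes that the omp_lib module supplies kmp_pointer_kind, kmp_size_t_kind, and the interfaces for kmp_malloc and kmp_free; the block size, the program name, and the check for a zero address on failure are assumptions made for the example.

```fortran
program kmp_heap_demo
  ! Assumption: omp_lib provides the kind parameters and kmp_* interfaces.
  use omp_lib
  implicit none
  integer(kind=kmp_pointer_kind) :: buf
  integer(kind=kmp_size_t_kind)  :: nbytes

  nbytes = 1024   ! illustrative block size in bytes

!$omp parallel private(buf)
  ! Each thread allocates from its own thread-local heap.
  buf = kmp_malloc(nbytes)

  ! Assumption: a zero address signals that the allocation failed.
  if (buf /= 0) then
    ! ... use the block at address buf here ...

    ! Free on the allocating thread to avoid the cross-thread penalty.
    call kmp_free(buf)
  end if
!$omp end parallel
end program kmp_heap_demo
```

Each thread frees the block it allocated, which avoids the performance penalty of freeing memory from a different thread.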

Function/Routine

Description

Stack Size

function kmp_get_stacksize_s()
integer(kind=kmp_size_t_kind) kmp_get_stacksize_s

Returns the number of bytes that will be allocated for each parallel thread to use as its private stack. This value can be changed via the kmp_set_stacksize_s routine, prior to the first parallel region, or via the KMP_STACKSIZE environment variable.

function kmp_get_stacksize()
integer kmp_get_stacksize

This routine is provided for backward compatibility only; use the kmp_get_stacksize_s routine for compatibility across different families of Intel processors.

subroutine kmp_set_stacksize_s(size)
integer(kind=kmp_size_t_kind) size

Sets the number of bytes that will be allocated for each parallel thread to use as its private stack to size. This value can also be set via the KMP_STACKSIZE environment variable. For kmp_set_stacksize_s to have an effect, it must be called before the beginning of the first (dynamically executed) parallel region in the program.

subroutine kmp_set_stacksize(size)
integer size

This routine is provided for backward compatibility only; use kmp_set_stacksize_s(size) for compatibility across different families of Intel processors.

Memory Allocation

function kmp_malloc(size)
integer(kind=kmp_pointer_kind) kmp_malloc
integer(kind=kmp_size_t_kind) size

Allocate memory block of size bytes from thread-local heap.

function kmp_calloc(nelem,elsize)
integer(kind=kmp_pointer_kind) kmp_calloc
integer(kind=kmp_size_t_kind) nelem
integer(kind=kmp_size_t_kind) elsize

Allocate array of nelem elements of size elsize from thread-local heap.

function kmp_realloc(ptr, size)
integer(kind=kmp_pointer_kind) kmp_realloc
integer(kind=kmp_pointer_kind) ptr
integer(kind=kmp_size_t_kind) size

Reallocate the memory block at address ptr to size bytes from the thread-local heap.

subroutine kmp_free(ptr)
integer(kind=kmp_pointer_kind) ptr

Free memory block at address ptr from thread-local heap. Memory must have been previously allocated with kmp_malloc, kmp_calloc, or kmp_realloc.