OpenMP support

OpenMP directives and clauses

This section lists the supported OpenMP directives and the support status of each of their clauses.

Note

✅ = supported; ❌ = unsupported; 🔶 = partial support

barrier

No clauses.

critical

No clauses.

for

  • ✅ collapse, firstprivate, lastprivate, private

  • 🔶 reduction

  • ❌ allocate, linear, nowait, order, ordered, schedule

parallel

  • ✅ default, firstprivate, if, num_threads, private, shared

  • 🔶 reduction

  • ❌ allocate, copyin, proc_bind

parallel for

Combines parallel and for directives. See clauses for for and parallel above.
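As an illustration of the combined directive, the sketch below estimates pi with a parallel for loop and a reduction clause. It assumes the njit and openmp_context imports shown in the device example later on this page. The fallback stubs in the except branch are hypothetical and exist only so the snippet also runs serially where PyOMP is not installed. Since reduction is listed as partially supported, treat this as a sketch rather than a guaranteed pattern.

```python
import math
from contextlib import nullcontext

try:
    from numba.openmp import njit, openmp_context as openmp
except ImportError:
    # Hypothetical stand-ins (not part of PyOMP): run the same code serially.
    def njit(func):
        return func

    def openmp(directive):
        return nullcontext()


@njit
def estimate_pi(num_steps):
    step = 1.0 / num_steps
    total = 0.0
    # Loop iterations are divided among the threads; each thread's partial
    # sum is combined into `total` when the loop finishes.
    with openmp("parallel for private(x) reduction(+:total)"):
        for i in range(num_steps):
            x = (i + 0.5) * step  # midpoint of the i-th interval
            total += 4.0 / (1.0 + x * x)
    return step * total


print(abs(estimate_pi(100_000) - math.pi) < 1e-8)
```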

single

  • ❌ allocate, copyprivate, firstprivate, nowait, private

task

  • ✅ default, firstprivate, private, shared

  • ❌ affinity, allocate, detach, if, in_reduction, final, mergeable, priority, untied

taskwait

  • ❌ depend, nowait

target

  • ✅ device, firstprivate, map, private, thread_limit

  • ❌ allocate, defaultmap, depend, has_device_addr, if, in_reduction, is_device_ptr, nowait, uses_allocators

teams

  • ✅ default, firstprivate, num_teams, private, shared, thread_limit

  • 🔶 reduction

distribute

  • ✅ firstprivate, lastprivate, private

  • ❌ allocate, collapse, dist_schedule, order

teams distribute

Combines teams and distribute directives. See clauses for teams and distribute above.

target teams

Combines target and teams directives. See clauses for target and teams above.

target data

  • ✅ device, map

  • ❌ if, use_device_ptr, use_device_addr

target enter data

  • ✅ device, map

  • ❌ depend, if, nowait

target exit data

Same clauses as target enter data. See above.

target update

  • ✅ device, from, to

  • ❌ depend, if, nowait

target teams distribute

Combines target, teams, and distribute directives. See clauses for target, teams, and distribute above.

distribute parallel for

Combines distribute and parallel for directives. See clauses for distribute, parallel, and for above.

target teams distribute parallel for

Combines target, teams, distribute, and parallel for directives. See clauses for target, teams, distribute, parallel, and for above.
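A minimal sketch of the fully combined directive is shown below, using only clauses marked as supported above (map). The function name saxpy and the fallback stubs are illustrative assumptions, not part of PyOMP. The stubs let the snippet run serially where PyOMP is not installed.

```python
from contextlib import nullcontext

import numpy as np

try:
    from numba.openmp import njit, openmp_context as openmp
except ImportError:
    # Hypothetical stand-ins (not part of PyOMP): run the same code serially.
    def njit(func):
        return func

    def openmp(directive):
        return nullcontext()


@njit
def saxpy(a, x, y):
    # x is only read on the device; y is copied to the device and back.
    with openmp("target teams distribute parallel for map(to: x) map(tofrom: y)"):
        for i in range(len(x)):
            y[i] = a * x[i] + y[i]
    return y


result = saxpy(2.0, np.arange(4.0), np.ones(4))
print(result)
```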

OpenMP runtime functions

Thread and team information

omp_get_thread_num()

Returns the thread number of the calling thread within its current team (0 for the primary thread)

omp_get_num_threads()

Returns the total number of threads in the current parallel region

omp_set_num_threads(n)

Sets the number of threads for subsequent parallel regions

omp_get_max_threads()

Returns the maximum number of threads available

omp_get_num_procs()

Returns the number of processors in the system

omp_get_thread_limit()

Returns the thread limit for the parallel region

omp_in_parallel()

Returns 1 if called within a parallel region, 0 otherwise

omp_get_team_num()

Returns the team number in a target region

omp_get_num_teams()

Returns the number of teams in a target region
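The thread query functions are typically called inside a parallel region. The sketch below has each thread record its own thread number in a shared array. It assumes the runtime functions can be imported from numba.openmp alongside njit and openmp_context. The function name record_thread_ids and the serial fallback stubs are hypothetical.

```python
from contextlib import nullcontext

import numpy as np

try:
    from numba.openmp import (
        njit,
        openmp_context as openmp,
        omp_get_thread_num,
        omp_set_num_threads,
    )
except ImportError:
    # Hypothetical serial stand-ins (not part of PyOMP).
    def njit(func):
        return func

    def openmp(directive):
        return nullcontext()

    def omp_get_thread_num():
        return 0

    def omp_set_num_threads(n):
        return None


@njit
def record_thread_ids(nthreads):
    ids = np.full(nthreads, -1)  # -1 marks slots no thread has written
    omp_set_num_threads(nthreads)
    with openmp("parallel private(tid) shared(ids)"):
        tid = omp_get_thread_num()
        ids[tid] = tid  # each thread writes only to its own slot
    return ids


ids = record_thread_ids(4)
print(ids)
```

The runtime may provide fewer threads than requested, in which case some slots keep the value -1.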

Timing

omp_get_wtime()

Returns elapsed wall-clock time in seconds (useful for performance profiling)
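A common use of omp_get_wtime() is to bracket a region and subtract the two readings. The sketch below assumes omp_get_wtime can be imported from numba.openmp. The function name timed_sum is illustrative, and the fallback (which substitutes time.perf_counter) is a hypothetical stand-in for environments without PyOMP.

```python
import time
from contextlib import nullcontext

try:
    from numba.openmp import njit, openmp_context as openmp, omp_get_wtime
except ImportError:
    # Hypothetical serial stand-ins (not part of PyOMP).
    def njit(func):
        return func

    def openmp(directive):
        return nullcontext()

    omp_get_wtime = time.perf_counter


@njit
def timed_sum(n):
    start = omp_get_wtime()
    total = 0.0
    with openmp("parallel for reduction(+:total)"):
        for i in range(n):
            total += 0.5 * i
    elapsed = omp_get_wtime() - start  # seconds spent in the loop
    return total, elapsed


total, elapsed = timed_sum(1000)
print(f"sum={total}, elapsed={elapsed:.6f} s")
```

Because both readings are taken inside the compiled function, the measurement excludes JIT compilation time.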

Nested and hierarchical parallelism

omp_set_nested(flag)

Enables or disables nested parallelism

omp_set_dynamic(flag)

Enables or disables dynamic thread adjustment

omp_set_max_active_levels(n)

Sets the maximum number of nested parallel levels

omp_get_max_active_levels()

Returns the maximum number of nested parallel levels

omp_get_level()

Returns the current nesting level

omp_get_active_level()

Returns the current active nesting level

omp_get_ancestor_thread_num(level)

Returns the thread number at a given nesting level

omp_get_team_size(level)

Returns the team size at a given nesting level

omp_get_supported_active_levels()

Returns the supported number of nested active levels

Advanced features

omp_get_proc_bind()

Returns the processor binding policy

omp_get_num_places()

Returns the number of available places

omp_get_place_num_procs(place)

Returns the number of processors in a place

omp_get_place_num()

Returns the current place number

omp_in_final()

Returns 1 if called in a final task, 0 otherwise

Device and target offloading

omp_get_num_devices()

Returns the number of available target devices

omp_get_device_num()

Returns the device number of the current target device

omp_set_default_device(device_id)

Sets the default device for subsequent target regions

omp_get_default_device()

Returns the default device ID for target regions

omp_is_initial_device()

Returns 1 if executing on the initial device (host), 0 otherwise

omp_get_initial_device()

Returns the device ID of the initial device (host)

Supported features and platforms

OpenMP and GPU offloading support

PyOMP builds on Numba Just-In-Time (JIT) compilation extensions and leverages LLVM’s OpenMP implementation to provide portable parallel execution. The supported OpenMP features depend on your versions of LLVM and Numba. For compatibility details, see the Numba support info in the Numba documentation.

PyOMP also supports GPU offloading for NVIDIA GPUs. The supported GPU architectures depend on the LLVM version and its OpenMP runtime. Consult the LLVM OpenMP documentation for details on your specific version.

Device selection and querying

PyOMP provides utilities in the offloading module to query available OpenMP target devices and select specific devices for offloading based on device type, vendor, and architecture. This enables fine-grained control over where target regions execute.

Discovering available devices

To see all available devices and their properties, use print_offloading_info():

from numba.openmp.offloading import print_offloading_info

print_offloading_info()

This prints information about all devices, including device counts and default device settings.

Finding devices by criteria

To programmatically find device IDs matching specific criteria, use find_device_ids():

from numba.openmp.offloading import find_device_ids

# Find all GPU devices
gpu_devices = find_device_ids(type="gpu")

# Find all NVIDIA GPUs
nvidia_gpus = find_device_ids(vendor="nvidia")

# Find NVIDIA GPUs with specific architecture (e.g., sm_80)
sm80_gpus = find_device_ids(vendor="nvidia", arch="sm_80")

# Find all AMD GPUs
amd_gpus = find_device_ids(vendor="amd")

# Find host/CPU device
host_devices = find_device_ids(type="host")

The function returns a list of device IDs (integers) matching the criteria. Any parameter can be None to act as a wildcard and match all values.
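Because the result is a plain list, an ordered fallback across several criteria is easy to write. The sketch below is one possible helper, not part of PyOMP: first_matching_device and fake_finder are hypothetical names, and the fake device inventory is made up so the example runs without any OpenMP runtime.

```python
def first_matching_device(preferences, finder):
    """Return the first device ID that satisfies any preference, in order.

    `preferences` is a list of keyword dicts (type, vendor, arch) and
    `finder` is a callable accepting the same keywords as find_device_ids().
    """
    for criteria in preferences:
        ids = finder(**criteria)
        if ids:
            return ids[0]
    return None


def fake_finder(type=None, vendor=None, arch=None):
    """Hypothetical stand-in for find_device_ids() with made-up devices."""
    devices = [
        {"id": 0, "type": "gpu", "vendor": "nvidia", "arch": "sm_80"},
        {"id": 1, "type": "host", "vendor": "host", "arch": "x86_64"},
    ]
    wanted = {"type": type, "vendor": vendor, "arch": arch}
    # None acts as a wildcard, mirroring the behavior described above.
    return [
        d["id"]
        for d in devices
        if all(v is None or d[k] == v for k, v in wanted.items())
    ]


prefs = [{"vendor": "amd"}, {"type": "gpu"}, {"type": "host"}]
print(first_matching_device(prefs, fake_finder))
```

In real code, pass find_device_ids from numba.openmp.offloading as the finder argument instead of the fake.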

Querying device properties

To determine the type, vendor, or architecture of a specific device ID, use the property getter functions:

from numba.openmp.offloading import (
    get_device_type,
    get_device_vendor,
    get_device_arch,
)

# Example device ID (assumed); in practice use one returned by find_device_ids()
device_id = 0

# Check device type
dev_type = get_device_type(device_id)  # Returns "gpu", "host", or None

# Check vendor
vendor = get_device_vendor(device_id)  # Returns "nvidia", "amd", "host", or None

# Check architecture
arch = get_device_arch(device_id)  # Returns architecture string or None
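Since each getter may return None, callers usually need a fallback value. The sketch below shows one way to build a readable summary. describe_device is a hypothetical helper that takes the getters as arguments, and the lookup tables are made-up stand-ins so the example runs without PyOMP.

```python
def describe_device(device_id, get_type, get_vendor, get_arch):
    """Build a one-line summary, substituting "unknown" for None results."""
    parts = [get_type(device_id), get_vendor(device_id), get_arch(device_id)]
    labels = [p if p is not None else "unknown" for p in parts]
    return f"device {device_id}: " + "/".join(labels)


# Hypothetical lookup tables standing in for the real getter functions;
# dict.get returns None for missing keys, like the getters described above.
fake_types = {0: "gpu", 1: "host"}
fake_vendors = {0: "nvidia", 1: "host"}
fake_archs = {0: "sm_80"}

for dev in (0, 1, 2):
    print(describe_device(dev, fake_types.get, fake_vendors.get, fake_archs.get))
```

With PyOMP installed, get_device_type, get_device_vendor, and get_device_arch can be passed in place of the fake lookups.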

Using device IDs in target regions

Once you have identified a device ID, you can use it in OpenMP target directives via the device clause:

from numba.openmp import njit, openmp_context as openmp
from numba.openmp.offloading import find_device_ids
import numpy as np

# Find first available NVIDIA GPU
nvidia_devices = find_device_ids(vendor="nvidia")
if nvidia_devices:
    device_id = nvidia_devices[0]
else:
    # Fall back to host if no NVIDIA GPU found
    device_id = find_device_ids(type="host")[0]


@njit
def inc(x):
    with openmp(f"target loop device({device_id}) map(tofrom: x)"):
        # Computation runs on specified device
        for i in range(len(x)):
            x[i] = x[i] + 1

    return x


x = inc(np.ones(10))
print(f"Result on device {device_id}: {x}")

Version and platform support

The following table shows tested combinations of PyOMP, Numba, Python, LLVM, and supported platforms:

| PyOMP | Numba | Python | LLVM | Supported platforms |
|-------|-------|--------|------|---------------------|
| 0.5.x | 0.62.x - 0.63.x | 3.10 - 3.14 | 20.x | linux-64, osx-arm64, linux-arm64 |
| 0.4.x | 0.61.x | 3.10 - 3.13 | 15.x | linux-64, osx-arm64, linux-arm64 |
| 0.3.x | 0.57.x - 0.60.x | 3.9 - 3.12 | 14.x | linux-64, osx-arm64, linux-arm64 |

OpenMP parallelism support by platform

| Platform | CPU | NVIDIA GPU | AMD GPU |
|----------|-----|------------|---------|
| linux-64 | ✅ Supported | ✅ Supported | 🔶 Work in progress |
| linux-arm64 | ✅ Supported | ✅ Supported | 🔶 Work in progress |
| osx-arm64 | ✅ Supported | ❌ Unsupported | ❌ Unsupported |

Platform details

  • linux-64: Linux x86_64 architecture

  • osx-arm64: macOS ARM64 (Apple Silicon)

  • linux-arm64: Linux ARM64 architecture

  • GPU offloading: Available on Linux platforms only (linux-64 and linux-arm64)
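A script can use the platform information above to decide early whether NVIDIA offloading is worth attempting. The sketch below uses only the standard library. The helper names platform_tag and nvidia_offload_supported are hypothetical, and the mapping from platform.system() and platform.machine() values to the platform tags is an assumption based on the values those functions commonly report.

```python
import platform

# NVIDIA offloading status per platform tag, transcribed from the table above.
NVIDIA_OFFLOAD = {"linux-64": True, "linux-arm64": True, "osx-arm64": False}


def platform_tag(system=None, machine=None):
    """Map platform.system()/machine() values to the tags used on this page."""
    system = system or platform.system()
    machine = machine or platform.machine()
    table = {
        ("Linux", "x86_64"): "linux-64",
        ("Linux", "aarch64"): "linux-arm64",
        ("Darwin", "arm64"): "osx-arm64",
    }
    return table.get((system, machine))  # None for platforms not listed


def nvidia_offload_supported(system=None, machine=None):
    """Return True only for platforms listed as supporting NVIDIA GPUs."""
    return NVIDIA_OFFLOAD.get(platform_tag(system, machine), False)


print(platform_tag(), nvidia_offload_supported())
```

A True result only means the platform is listed as supported. An NVIDIA GPU and driver still have to be present at run time.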

Notes

  • The Python 3.14 free-threaded build (cp314t) is not supported by the current Numba/llvmlite version.

  • LLVM version 20.1.8 is used for the current PyOMP 0.5.x releases.

  • GPU offloading requires an NVIDIA GPU and an NVIDIA driver on a supported Linux platform.

  • AMD GPU support is in active development.