Installation

GPUFlight supports both NVIDIA CUDA and AMD ROCm backends. You need the C++ library for integration into your GPU application, and optionally the Python library for analysis and visualization.

C++ Library (Integration)

The recommended way to integrate gpufl into your C++ project is via CMake's FetchContent.

NVIDIA Prerequisites

CMake 3.20 or higher
CUDA Toolkit (including CUPTI)
A C++17 compatible compiler

AMD Prerequisites

CMake 3.28 or higher
ROCm 6.x with HIP runtime
ROCm SMI library
rocprofiler-sdk
A C++17 compatible compiler

CMake Integration

Add the following to your CMakeLists.txt:

include(FetchContent)

FetchContent_Declare(
    gpufl
    GIT_REPOSITORY https://github.com/gpu-flight/gpufl-client.git
    GIT_TAG        main
)

FetchContent_MakeAvailable(gpufl)

For NVIDIA targets:

target_link_libraries(my_app PRIVATE gpufl::gpufl CUDA::cudart)
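Put together, a minimal NVIDIA CMakeLists.txt might look like the sketch below. The project name `my_app` and the source file `main.cpp` are placeholders. The `find_package(CUDAToolkit)` call is an assumption added here because it is what defines the `CUDA::cudart` imported target in CMake; gpufl may already perform this lookup itself, in which case the extra call is harmless.

```cmake
cmake_minimum_required(VERSION 3.20)
project(my_app LANGUAGES CXX)

# gpufl requires a C++17 compatible compiler.
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Assumption: locate the CUDA Toolkit so the CUDA::cudart target exists.
find_package(CUDAToolkit REQUIRED)

include(FetchContent)

FetchContent_Declare(
    gpufl
    GIT_REPOSITORY https://github.com/gpu-flight/gpufl-client.git
    GIT_TAG        main
)

FetchContent_MakeAvailable(gpufl)

# Placeholder executable and source file.
add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE gpufl::gpufl CUDA::cudart)
```

If your application also compiles `.cu` sources, you would likely add `CUDA` to the `LANGUAGES` list as well.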

For AMD/HIP targets:

# Enable the AMD backend. Set these before FetchContent_MakeAvailable(gpufl) so they take effect.
set(GPUFL_ENABLE_AMD ON CACHE BOOL "" FORCE)
set(GPUFL_ENABLE_NVIDIA OFF CACHE BOOL "" FORCE)

target_link_libraries(my_app PRIVATE gpufl::gpufl hip::host)
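Because cache options only influence a fetched project if they are set before it is configured, the order of statements matters for the AMD case. The sketch below shows one possible complete CMakeLists.txt. The names `my_app` and `main.cpp` are placeholders, and the `find_package(hip)` call is an assumption: it is the ROCm-provided package that defines `hip::host`, and gpufl may already locate it for you.

```cmake
cmake_minimum_required(VERSION 3.28)
project(my_app LANGUAGES CXX)

# gpufl requires a C++17 compatible compiler.
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Select the backend BEFORE gpufl is fetched and configured.
set(GPUFL_ENABLE_AMD ON CACHE BOOL "" FORCE)
set(GPUFL_ENABLE_NVIDIA OFF CACHE BOOL "" FORCE)

include(FetchContent)

FetchContent_Declare(
    gpufl
    GIT_REPOSITORY https://github.com/gpu-flight/gpufl-client.git
    GIT_TAG        main
)

FetchContent_MakeAvailable(gpufl)

# Assumption: the ROCm hip package provides the hip::host target.
find_package(hip REQUIRED)

# Placeholder executable and source file.
add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE gpufl::gpufl hip::host)
```

If CMake cannot find the `hip` package, pointing `CMAKE_PREFIX_PATH` at your ROCm installation directory when configuring is the usual remedy.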

Build Options

| Option | Default | Description |
| --- | --- | --- |
| `GPUFL_ENABLE_NVIDIA` | `ON` | Enable NVIDIA backends (CUDA + NVML) |
| `GPUFL_ENABLE_AMD` | `OFF` | Enable AMD backends (ROCm + HIP) |
| `BUILD_TESTING` | `ON` | Build test suite |
| `BUILD_PYTHON` | `OFF` | Build Python bindings |
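A consuming project can pin these options before fetching gpufl, for example to skip building its test suite. This is a sketch that assumes gpufl reads `BUILD_TESTING` and `BUILD_PYTHON` exactly as listed in the table. Note that `BUILD_TESTING` is also the standard CTest variable, so forcing it off affects your own project's tests if they are guarded by it.

```cmake
# Skip gpufl's test suite and Python bindings when using it as a dependency.
# These must appear before FetchContent_MakeAvailable(gpufl).
set(BUILD_TESTING OFF CACHE BOOL "" FORCE)
set(BUILD_PYTHON OFF CACHE BOOL "" FORCE)

include(FetchContent)

FetchContent_Declare(
    gpufl
    GIT_REPOSITORY https://github.com/gpu-flight/gpufl-client.git
    GIT_TAG        main
)

FetchContent_MakeAvailable(gpufl)
```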

Python Library (Analysis)

The Python library provides tools for analyzing, reporting, and visualizing the logs generated by the C++ library.

Basic Installation

pip install gpufl

Full Installation (with Visualization)

pip install "gpufl[numba,viz,analyzer]"

The Python library works with logs from both NVIDIA and AMD sessions, so no backend-specific installation is needed for analysis.
