📄️ NVIDIA / CUDA Integration
GPUFlight supports NVIDIA GPUs via CUDA, providing kernel interception through CUPTI, system telemetry via NVML, multiple profiling engines, SASS disassembly, and source-to-assembly correlation.
📄️ AMD / ROCm Integration
GPUFlight supports AMD GPUs via ROCm, including HIP kernel tracing, system telemetry, occupancy analysis, and ISA disassembly.
📄️ Report Generation
GPUFlight can generate a text-based performance report after a profiling session. The report summarizes kernel execution, memory transfers, system metrics, scope timing, and profile analysis.
📄️ ISA Disassembly
GPUFlight automatically captures and disassembles GPU code objects, providing instruction-level visibility into your kernels.
📄️ C++ Integration Guide
This guide covers how to use GPUFlight in your CUDA or HIP C++ application.
📄️ Python Analysis & Visualization
The gpufl Python library provides tools for analyzing, reporting on, and visualizing the structured NDJSON (newline-delimited JSON) logs produced by the C++ library. It works with logs from both NVIDIA and AMD sessions.
📄️ Testing
The GPUFlight project includes test suites for both the C++ and Python components.