Docker image

The Docker image is the easiest way to run the collector. It is published to GitHub Container Registry and supports linux/amd64 and linux/arm64.
docker pull ghcr.io/openlit/otel-gpu-collector:latest
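The pull alone doesn't start anything; a minimal run sketch, assuming NVIDIA GPUs and a reachable OTLP gRPC endpoint (the container name and endpoint value are illustrative):

```shell
# Illustrative only — adjust GPU access flags and the endpoint for your setup.
docker run -d --name otel-gpu-collector \
  --gpus all \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4317 \
  ghcr.io/openlit/otel-gpu-collector:latest
```

For AMD/Intel GPUs, swap --gpus all for --device /dev/dri (see Troubleshooting).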

Tags

Tag      Description
latest   Most recent release
1.2.3    Specific version
1.2      Latest patch of a minor version

Pre-built binaries

Download a binary for your platform from the GitHub Releases page. Binaries are available for:
Platform   Architecture
Linux      amd64, arm64, armv7
macOS      amd64 (Intel), arm64 (Apple Silicon)
Windows    amd64, arm64
# Example: Linux amd64
curl -L https://github.com/openlit/openlit/releases/latest/download/opentelemetry-gpu-collector-<version>-linux-amd64 \
    -o opentelemetry-gpu-collector
chmod +x opentelemetry-gpu-collector
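Choosing the right artifact can be scripted from uname; a sketch assuming the <os>-<arch> naming shown above (whether macOS builds use a darwin suffix is an assumption, and <version> is left as a placeholder):

```shell
# Map the local platform to the artifact suffix used by the release names.
os="$(uname -s | tr '[:upper:]' '[:lower:]')"   # linux, darwin, ...
case "$(uname -m)" in
  x86_64)        arch=amd64 ;;
  aarch64|arm64) arch=arm64 ;;
  armv7l)        arch=armv7 ;;
  *) echo "unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac
echo "opentelemetry-gpu-collector-<version>-${os}-${arch}"
```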

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
./opentelemetry-gpu-collector
GPU metrics (NVML, sysfs) and eBPF tracing work only on Linux; on macOS and Windows the binary collects host and process metrics only.
Verify the SHA256 checksum from the SHA256SUMS.txt file in the release:
sha256sum -c SHA256SUMS.txt --ignore-missing
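To see what a passing check looks like, the flow can be rehearsed with a throwaway file (file names here are illustrative; a real verification uses the downloaded binary alongside the release's SHA256SUMS.txt):

```shell
# Create a dummy artifact, record its checksum, then verify it —
# the same flow as verifying a real release download.
tmpdir="$(mktemp -d)"
cd "$tmpdir"
printf 'demo payload' > opentelemetry-gpu-collector
sha256sum opentelemetry-gpu-collector > SHA256SUMS.txt
sha256sum -c SHA256SUMS.txt --ignore-missing   # prints "... OK" on success
```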

Build from source

Requirements: Go 1.21+, CGO enabled (required for NVML on Linux).
git clone https://github.com/openlit/openlit.git
cd openlit/opentelemetry-gpu-collector
make build
./opentelemetry-gpu-collector
For eBPF CUDA tracing support, also run:
make setup-bpf   # installs bpftool, generates vmlinux.h
make generate    # runs bpf2go code generation
make build

Upgrade

Docker

docker pull ghcr.io/openlit/otel-gpu-collector:latest
docker stop otel-gpu-collector
docker rm otel-gpu-collector
# re-run with same flags

Binary

Download the new binary from the Releases page, replace the existing file, and restart the process.
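As a sketch, assuming the binary lives in /usr/local/bin (matching the Uninstall section below) and substituting a real <version>:

```shell
# Sketch of an in-place binary upgrade; install path and <version> are assumptions.
curl -L https://github.com/openlit/openlit/releases/latest/download/opentelemetry-gpu-collector-<version>-linux-amd64 \
    -o /tmp/opentelemetry-gpu-collector
chmod +x /tmp/opentelemetry-gpu-collector
sudo mv /tmp/opentelemetry-gpu-collector /usr/local/bin/opentelemetry-gpu-collector
# then restart the process however it is managed (systemd, supervisor, shell)
```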

Uninstall

Docker

docker stop otel-gpu-collector
docker rm otel-gpu-collector
docker rmi ghcr.io/openlit/otel-gpu-collector:latest

Binary

rm /usr/local/bin/opentelemetry-gpu-collector

Troubleshooting

GPU not detected

  • Confirm the host has a supported GPU: lspci | grep -E 'VGA|3D|Display'
  • For NVIDIA: verify libnvidia-ml.so is present: ldconfig -p | grep nvidia-ml
  • For Docker: ensure --gpus all (NVIDIA) or --device /dev/dri (AMD/Intel) is passed
  • Check logs for "discovered GPU" entries: docker logs otel-gpu-collector

eBPF CUDA tracing not working

  • Confirm OTEL_GPU_EBPF_ENABLED=true is set
  • Check the kernel version: uname -r (requires 5.8+)
  • The process needs CAP_BPF and CAP_PERFMON, or must run as root
  • For Docker: add --privileged or --cap-add CAP_BPF --cap-add CAP_PERFMON
  • Verify libcudart.so is present on the host: ldconfig -p | grep cudart

No data reaching the backend

  • Verify OTEL_EXPORTER_OTLP_ENDPOINT is reachable from the container: curl http://<endpoint>/health
  • For Docker networking: use the host IP or service name, not localhost
  • Make sure the protocol matches the backend: gRPC uses port 4317; for HTTP backends (port 4318), set OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

Intel GPU metrics missing

  • Verify the i915 or Xe driver is loaded: lsmod | grep -E 'i915|xe'
  • Check that DRM entries exist: ls /sys/class/drm/
  • sysfs metric exposure requires Linux kernel 5.10+
  • Fan speed requires kernel 6.16+
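The kernel prerequisites above (5.8+ for eBPF tracing, 5.10+ for sysfs metrics, 6.16+ for fan speed) can be checked with a small script; version_ge is a hypothetical helper, not part of the collector:

```shell
# Compare the running kernel against each minimum version using sort -V.
version_ge() {
  # true if $1 >= $2 under version ordering
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

kernel="$(uname -r | cut -d- -f1)"
for req in "5.8 eBPF CUDA tracing" "5.10 sysfs GPU metrics" "6.16 fan speed"; do
  min="${req%% *}"; feature="${req#* }"
  if version_ge "$kernel" "$min"; then
    echo "ok: $feature (needs $min+, have $kernel)"
  else
    echo "unavailable: $feature needs kernel $min+ (have $kernel)"
  fi
done
```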