In this guide you’ll pull the collector Docker image, point it at your OTel backend, and start seeing GPU and host metrics within minutes.
## Prerequisites

- Linux host with an NVIDIA, AMD, or Intel GPU (for GPU metrics)
- Docker installed
- An OpenTelemetry-compatible backend (OpenLIT, Grafana, Datadog, or any OTLP endpoint)

Don't have an OTel backend? Start OpenLIT locally:
```shell
docker run -d \
  --name openlit \
  -p 3000:3000 \
  -p 4318:4318 \
  ghcr.io/openlit/openlit:latest
```

Then use `http://localhost:4318` as your `OTEL_EXPORTER_OTLP_ENDPOINT`.
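Before wiring the endpoint into the collector, a quick sanity check on the URL shape can save a silent export failure. A minimal sketch (the `endpoint` value is the local OpenLIT address started above; whether the backend actually accepts OTLP/HTTP on that port is a separate runtime question):

```shell
# Sketch: check that the OTLP endpoint is a bare scheme://host:port URL
# (no trailing path), the shape OTEL_EXPORTER_OTLP_ENDPOINT expects here.
endpoint='http://localhost:4318'
if [[ "$endpoint" =~ ^https?://[^/]+:[0-9]+$ ]]; then
  status="valid"
else
  status="malformed"
fi
echo "$status"
```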
## Pull the collector image

```shell
docker pull ghcr.io/openlit/otel-gpu-collector:latest
```
## Run the collector

### NVIDIA GPU

```shell
docker run -d \
  --name otel-gpu-collector \
  --gpus all \
  -e OTEL_SERVICE_NAME=my-app \
  -e OTEL_RESOURCE_ATTRIBUTES='deployment.environment=production' \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
  ghcr.io/openlit/otel-gpu-collector:latest
```

Requires the NVIDIA Container Toolkit on the host.

### AMD GPU

```shell
docker run -d \
  --name otel-gpu-collector \
  --device /dev/kfd:/dev/kfd \
  --device /dev/dri:/dev/dri \
  -e OTEL_SERVICE_NAME=my-app \
  -e OTEL_RESOURCE_ATTRIBUTES='deployment.environment=production' \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
  ghcr.io/openlit/otel-gpu-collector:latest
```

### Intel GPU

```shell
docker run -d \
  --name otel-gpu-collector \
  --device /dev/dri:/dev/dri \
  -e OTEL_SERVICE_NAME=my-app \
  -e OTEL_RESOURCE_ATTRIBUTES='deployment.environment=production' \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
  ghcr.io/openlit/otel-gpu-collector:latest
```

Requires Linux kernel 5.10+ with the i915 or Xe driver.

### Host metrics only

```shell
docker run -d \
  --name otel-gpu-collector \
  -e OTEL_SERVICE_NAME=my-app \
  -e OTEL_RESOURCE_ATTRIBUTES='deployment.environment=production' \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
  ghcr.io/openlit/otel-gpu-collector:latest
```

The collector will export host and process metrics even without GPU access.
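All four variants pass the same three environment variables, and `OTEL_RESOURCE_ATTRIBUTES` takes comma-separated `key=value` pairs. A small sketch that validates that shape before handing the string to `docker run -e` (the `service.version` attribute is illustrative, not required by the collector):

```shell
# Sketch: verify every comma-separated entry in OTEL_RESOURCE_ATTRIBUTES
# is a key=value pair before passing it to the collector container.
attrs='deployment.environment=production,service.version=1.2.3'
valid=true
IFS=',' read -ra pairs <<< "$attrs"
for pair in "${pairs[@]}"; do
  case "$pair" in
    *=*) ;;             # well-formed key=value entry
    *)   valid=false ;; # malformed entry
  esac
done
echo "$valid"
```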
## Verify it's running

```shell
docker logs otel-gpu-collector
```

You should see output like:

```
time=2024-01-01T00:00:00Z level=INFO msg="starting opentelemetry-gpu-collector"
time=2024-01-01T00:00:00Z level=INFO msg="discovered GPU" address=0000:01:00.0 vendor=nvidia
time=2024-01-01T00:00:00Z level=INFO msg="system metrics collector initialized"
time=2024-01-01T00:00:00Z level=INFO msg="process metrics collector initialized"
time=2024-01-01T00:00:00Z level=INFO msg="collector running"
```
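Beyond eyeballing the startup lines, a quick error scan confirms a clean start. A sketch over a captured sample (in practice you would pipe `docker logs otel-gpu-collector` into the `grep` instead of the hardcoded `logs` variable):

```shell
# Sketch: count error-level lines in collector output; 0 means a clean start.
logs='time=2024-01-01T00:00:00Z level=INFO msg="discovered GPU" address=0000:01:00.0 vendor=nvidia
time=2024-01-01T00:00:00Z level=INFO msg="collector running"'
# grep -c exits non-zero when nothing matches, so guard with || true
error_count=$(printf '%s\n' "$logs" | grep -c 'level=ERROR' || true)
echo "$error_count"
```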
## View metrics in your backend

Open your OTel backend and look for metrics in the `hw.gpu.*`, `system.*`, and `process.*` namespaces. If using OpenLIT, navigate to http://localhost:3000 and go to the Metrics section.
## Docker Compose

Add the collector as a service alongside your existing stack:

```yaml
services:
  otel-gpu-collector:
    image: ghcr.io/openlit/otel-gpu-collector:latest
    environment:
      OTEL_SERVICE_NAME: my-app
      OTEL_RESOURCE_ATTRIBUTES: deployment.environment=production
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    depends_on:
      - otel-collector
    restart: always
```
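The Compose example above requests NVIDIA devices through `deploy.resources`; there is no equivalent device request for AMD or Intel GPUs, so a sketch passing the device nodes directly (mirroring the `docker run --device` flags earlier, and assuming your stack's OTLP service is named `otel-collector` as above) might look like:

```yaml
services:
  otel-gpu-collector:
    image: ghcr.io/openlit/otel-gpu-collector:latest
    environment:
      OTEL_SERVICE_NAME: my-app
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
    devices:
      - /dev/kfd:/dev/kfd   # AMD only; drop this line for Intel
      - /dev/dri:/dev/dri
    restart: always
```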
- **Configuration**: full reference for all environment variables and defaults
- **Metrics reference**: complete list of all metrics, types, units, and attributes