The collector monitors Intel GPUs through the Linux kernel's sysfs, hwmon, and DRM interfaces, which the i915 and Xe drivers expose. It does not require Intel GPU tools, oneAPI, or any user-space libraries.
Intel GPU support provides thermal, power, energy, and clock metrics, plus fan speed on kernel 6.16+. Utilization and memory metrics are not available through the sysfs/hwmon interface; collecting them would require Intel XPU Manager or similar tooling.
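
To make the sysfs-based approach concrete, here is a minimal Python sketch of how an Intel GPU's hwmon directory could be located. It is not the collector's actual implementation. The function name `find_intel_hwmon_dirs`, the `sysfs_root` parameter, and the assumption that the drivers register their hwmon devices under the names `i915` and `xe` are illustrative; check the `name` files on your own system.

```python
from pathlib import Path

# Assumption: the i915 and xe drivers register hwmon devices under these
# names. Verify with `cat /sys/class/hwmon/hwmon*/name` on your host.
INTEL_HWMON_NAMES = {"i915", "xe"}


def find_intel_hwmon_dirs(sysfs_root="/sys"):
    """Return hwmon directories whose `name` file matches an Intel GPU driver."""
    hwmon_class = Path(sysfs_root) / "class" / "hwmon"
    matches = []
    if not hwmon_class.is_dir():
        return matches
    for entry in sorted(hwmon_class.iterdir()):
        try:
            name = (entry / "name").read_text().strip()
        except OSError:
            continue  # missing or unreadable name file: skip this device
        if name in INTEL_HWMON_NAMES:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    for hwmon_dir in find_intel_hwmon_dirs():
        print(hwmon_dir)
```

Running the script on a host prints one line per matching hwmon directory, or nothing if no matching device is found. The `sysfs_root` parameter exists only so the function can be pointed at a different mount point or a test fixture.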

Requirements

  • Linux with the i915 or xe kernel driver
  • Kernel 5.10+ for sysfs metric exposure
  • Kernel 6.16+ for fan speed (fan1_input)
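
The kernel requirements above can be checked before deploying. The following Python sketch compares the running kernel's release string against the 5.10 and 6.16 thresholds listed here. The function names and the returned dictionary keys are hypothetical, and the check only looks at the version number, so it cannot detect drivers or features that a distribution has backported to an older kernel.

```python
import platform
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '6.8.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def supported_features(release):
    """Map the documented kernel thresholds to booleans for this release."""
    version = parse_kernel_release(release)
    return {
        "sysfs_metrics": version >= (5, 10),  # temperature, power, energy
        "fan_speed": version >= (6, 16),      # fan1_input
    }


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string on Linux.
    print(supported_features(platform.release()))
```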

Collected metrics

Metric                  Source                 Requirement
hw.gpu.temperature      hwmon temp1_input      kernel 5.10+
hw.gpu.power.draw       hwmon power1_average   kernel 5.10+
hw.gpu.power.limit      hwmon power1_max       kernel 5.10+
hw.gpu.energy.consumed  hwmon energy1_input    kernel 5.10+
hw.gpu.clock.graphics   DRM gt_cur_freq_mhz    Xe driver
hw.gpu.fan_speed        hwmon fan1_input       kernel 6.16+
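
The hwmon sources in this table are plain text files containing integers in the units defined by the kernel's hwmon sysfs ABI: millidegrees Celsius for `temp*_input`, microwatts for `power*` attributes, microjoules for `energy*_input`, and RPM for `fan*_input`. The Python sketch below reads whichever of these files exist in a given hwmon directory and converts them to degrees Celsius, watts, joules, and RPM. The function name and the choice of output units are assumptions made for illustration; the units the collector actually reports are listed in the metrics reference.

```python
from pathlib import Path

# hwmon attribute -> (metric name from the table, divisor to the output unit)
HWMON_METRICS = {
    "temp1_input": ("hw.gpu.temperature", 1000.0),            # millidegrees C -> degrees C
    "power1_average": ("hw.gpu.power.draw", 1_000_000.0),     # microwatts -> watts
    "power1_max": ("hw.gpu.power.limit", 1_000_000.0),        # microwatts -> watts
    "energy1_input": ("hw.gpu.energy.consumed", 1_000_000.0), # microjoules -> joules
    "fan1_input": ("hw.gpu.fan_speed", 1.0),                  # already in RPM
}


def read_hwmon_metrics(hwmon_dir):
    """Read the attributes that exist in hwmon_dir and return converted values."""
    readings = {}
    for attr, (metric, divisor) in HWMON_METRICS.items():
        try:
            raw = int((Path(hwmon_dir) / attr).read_text().strip())
        except (OSError, ValueError):
            continue  # attribute absent on this kernel/device, or not an integer
        readings[metric] = raw / divisor
    return readings
```

A call such as `read_hwmon_metrics("/sys/class/hwmon/hwmon3")` (the index is a placeholder and varies per host) returns only the metrics the device exposes, so `hw.gpu.fan_speed` is simply absent where `fan1_input` is not available.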

Docker

docker run -d \
  --name otel-gpu-collector \
  --device /dev/dri:/dev/dri \
  -e OTEL_SERVICE_NAME=my-app \
  -e OTEL_RESOURCE_ATTRIBUTES='deployment.environment=production' \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318 \
  ghcr.io/openlit/otel-gpu-collector:latest

Docker Compose

services:
  otel-gpu-collector:
    image: ghcr.io/openlit/otel-gpu-collector:latest
    environment:
      OTEL_SERVICE_NAME: my-app
      OTEL_RESOURCE_ATTRIBUTES: deployment.environment=production
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
    devices:
      - /dev/dri:/dev/dri
    restart: always

Kubernetes (DaemonSet)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-gpu-collector
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: otel-gpu-collector
  template:
    metadata:
      labels:
        app: otel-gpu-collector
    spec:
      containers:
        - name: collector
          image: ghcr.io/openlit/otel-gpu-collector:latest
          env:
            - name: OTEL_SERVICE_NAME
              value: gpu-collector
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: deployment.environment=production
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://otel-collector.monitoring.svc.cluster.local:4318
          securityContext:
            privileged: false
          volumeMounts:
            - name: sys
              mountPath: /sys
              readOnly: true
      volumes:
        - name: sys
          hostPath:
            path: /sys

Metrics reference

Full metrics list with types, units, and attributes

Configuration

All environment variables and defaults