OpenLIT uses OpenTelemetry to help you monitor NVIDIA GPUs. This includes tracking GPU metrics like utilization, temperature, memory usage and power consumption.

Get started

1. Install OpenLIT

Open your command line or terminal and run:
```shell
pip install openlit
```
2. Initialize OpenLIT in your Application

Perfect for existing applications - no code modifications needed:
```shell
# Configure via CLI arguments
openlit-instrument \
  --service-name my-ai-app \
  --environment production \
  --otlp-endpoint YOUR_OTEL_ENDPOINT \
  python your_app.py
```
Perfect for: Legacy applications, production systems where code changes need approval, quick testing, or when you want to add observability without touching existing code.
Replace YOUR_OTEL_ENDPOINT with the URL of your OpenTelemetry backend, such as http://127.0.0.1:4318 if you are using OpenLIT and a local OTel Collector. To send metrics and traces to other observability tools, refer to the supported destinations. For more advanced configurations and application use cases, visit the OpenLIT Python repository or the OpenLIT TypeScript repository.
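The same settings can often be supplied through standard OpenTelemetry environment variables instead of per-run CLI flags. The variable names below follow the OTel SDK convention; it is an assumption (not confirmed by this page) that `openlit-instrument` picks them up:

```shell
# Sketch: configure via standard OTel environment variables.
# Assumption: openlit-instrument honors these, as OpenTelemetry SDKs generally do.
export OTEL_SERVICE_NAME="my-ai-app"
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=production"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"

# Then launch without flags (commented out here; requires openlit to be installed):
# openlit-instrument python your_app.py
```

This keeps deployment manifests (systemd units, Dockerfiles, CI configs) free of tool-specific flags.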
1. Pull the `otel-gpu-collector` Docker Image

You can quickly start using the OTel GPU Collector by pulling the Docker image:
```shell
docker pull ghcr.io/openlit/otel-gpu-collector:latest
```
2. Run the `otel-gpu-collector` Docker Container

Here’s a quick example showing how to run the container with the required environment variables:
```shell
docker run --gpus all \
    -e GPU_APPLICATION_NAME='chatbot' \
    -e GPU_ENVIRONMENT='staging' \
    -e OTEL_EXPORTER_OTLP_ENDPOINT="YOUR_OTEL_ENDPOINT" \
    -e OTEL_EXPORTER_OTLP_HEADERS="YOUR_OTEL_HEADERS" \
    ghcr.io/openlit/otel-gpu-collector:latest
```
For more advanced configurations of the collector, visit the OTel GPU Collector repository.

Note: If you’ve deployed OpenLIT using Docker Compose, make sure to use the host’s IP address as the OTLP endpoint, or add the OTel GPU Collector as a service in your Docker Compose file:
```yaml
otel-gpu-collector:
  image: ghcr.io/openlit/otel-gpu-collector:latest
  environment:
    GPU_APPLICATION_NAME: 'chatbot'
    GPU_ENVIRONMENT: 'staging'
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector:4318"
  device_requests:
  - driver: nvidia
    count: all
    capabilities: [gpu]
  depends_on:
  - otel-collector
  restart: always
```
If you use the host’s IP address instead, set the endpoint accordingly, for example:

```shell
OTEL_EXPORTER_OTLP_ENDPOINT="http://192.168.10.15:4318"
```

Environment Variables

OTel GPU Collector supports several environment variables for configuration. Below is a table that describes each variable:
| Environment Variable | Description | Default Value |
| --- | --- | --- |
| `GPU_APPLICATION_NAME` | Name of the application running on the GPU | `default_app` |
| `GPU_ENVIRONMENT` | Environment name (e.g., `staging`, `production`) | `production` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OpenTelemetry OTLP endpoint URL | (required) |
| `OTEL_EXPORTER_OTLP_HEADERS` | Headers for authenticating with the OTLP endpoint | (optional; ignore if using OpenLIT) |
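The resolution rules in the table can be sketched as a small configuration step. The Python below is an illustrative sketch of how a collector might read these variables, not the collector’s actual source:

```python
import os

def resolve_collector_config(env=os.environ):
    """Illustrative sketch: resolve the OTel GPU Collector's environment
    variables using the defaults from the table above."""
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if not endpoint:
        # The OTLP endpoint is the only required setting.
        raise ValueError("OTEL_EXPORTER_OTLP_ENDPOINT is required")
    return {
        "application_name": env.get("GPU_APPLICATION_NAME", "default_app"),
        "environment": env.get("GPU_ENVIRONMENT", "production"),
        "endpoint": endpoint,
        # Headers are optional; leave empty when using OpenLIT.
        "headers": env.get("OTEL_EXPORTER_OTLP_HEADERS", ""),
    }

# Only the required variable set; everything else falls back to the defaults.
cfg = resolve_collector_config({"OTEL_EXPORTER_OTLP_ENDPOINT": "http://otel-collector:4318"})
print(cfg["application_name"])  # default_app
```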

Collected Metrics

The collector reports GPU metrics including utilization, temperature, memory usage, and power consumption.
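As a rough illustration of the shape of the collected data, here is a hypothetical snapshot covering the metric families mentioned above. The metric names and units are assumptions for illustration only, not the collector’s exact identifiers:

```python
# Hypothetical GPU metrics snapshot; names and units are illustrative assumptions.
sample_metrics = {
    "gpu.utilization": {"value": 87, "unit": "%"},        # compute utilization
    "gpu.temperature": {"value": 71, "unit": "Cel"},      # core temperature
    "gpu.memory.used": {"value": 10240, "unit": "MiB"},   # memory in use
    "gpu.power.draw": {"value": 250, "unit": "W"},        # power consumption
}

for name, metric in sample_metrics.items():
    print(f"{name}: {metric['value']} {metric['unit']}")
```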

Kubernetes

Running in Kubernetes? Try the OpenLIT Operator

Automatically inject instrumentation into existing workloads without modifying pod specs, container images, or application code.