OpenLIT uses OpenTelemetry to help you monitor NVIDIA and AMD GPUs for AI applications. Track GPU metrics like utilization, temperature, memory usage, and power consumption during AI training and inference workloads.

Choose your method

GPU monitoring can be implemented in two ways depending on your setup and requirements:

OpenLIT SDK

This method is useful if you already have an AI application running on a GPU that is instrumented with OpenLIT. It extends your existing observability to include GPU metrics alongside your LLM traces.
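
For instance, a minimal sketch, assuming your application already calls `openlit.init()` (the flag name comes from the SDK Configuration Options table below):

```python
import openlit

# Sketch: if your app is already instrumented with OpenLIT, this one
# flag (see "SDK Configuration Options" below) extends the same
# telemetry pipeline to also emit GPU and system metrics.
openlit.init(collect_system_metrics=True)
```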

OpenTelemetry GPU Collector

This method is useful for remote GPUs that only host LLM models, and for containerized deployments. It lets you collect GPU metrics without modifying application code.
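
As a sketch, the collector can run as a standalone container next to the GPU. The image name below is an assumption, and the environment variables mirror the Environment Variables table later on this page; check the collector's own documentation for exact values:

```bash
# Sketch: run a GPU metrics collector beside the model server.
# The image name is assumed; the OTEL_* variables match the
# "Environment Variables" table below, and the values are placeholders.
docker run --rm --gpus all \
  -e OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-backend:4318" \
  -e OTEL_SERVICE_NAME="remote-gpu-node" \
  -e OTEL_DEPLOYMENT_ENVIRONMENT="production" \
  ghcr.io/openlit/otel-gpu-collector:latest
```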

Supported Parameters

SDK Configuration Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `collect_system_metrics` | boolean | `False` | Enable GPU and system metrics collection |
| `otlp_endpoint` | string | `None` | OpenTelemetry OTLP endpoint URL |
| `otlp_headers` | string | `None` | Authentication headers for OTLP endpoint |
| `service_name` | string | `"unknown_service"` | Name of your AI application |
| `environment` | string | `None` | Deployment environment (dev, staging, prod) |
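
Putting the options together, a sketch of a fully configured `init()` call; every value is a placeholder:

```python
import openlit

# Sketch: all SDK options from the table above, with placeholder values.
openlit.init(
    collect_system_metrics=True,                  # GPU + system metrics
    otlp_endpoint="http://127.0.0.1:4318",        # OTLP endpoint URL
    otlp_headers="Authorization=Bearer <token>",  # placeholder auth header
    service_name="my-gpu-app",
    environment="production",
)
```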

Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `OPENLIT_COLLECT_SYSTEM_METRICS` | Enable GPU monitoring | `true` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP endpoint URL | `http://127.0.0.1:4318` |
| `OTEL_SERVICE_NAME` | Service name for telemetry | `my-gpu-app` |
| `OTEL_DEPLOYMENT_ENVIRONMENT` | Deployment environment | `production` |
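
The same configuration can be supplied through the environment instead of code; a sketch using the variables above (values are placeholders, and `app.py` stands in for your instrumented application):

```bash
# Sketch: configure GPU monitoring entirely via environment variables.
export OPENLIT_COLLECT_SYSTEM_METRICS=true
export OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"
export OTEL_SERVICE_NAME="my-gpu-app"
export OTEL_DEPLOYMENT_ENVIRONMENT="production"

python app.py  # placeholder entry point for your instrumented app
```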

Kubernetes

Running in Kubernetes? Try the OpenLIT Operator

Automatically inject instrumentation into existing workloads without modifying pod specs, container images, or application code.