Choose your method
GPU monitoring can be implemented in two ways, depending on your setup and requirements:
OpenLIT SDK
Use this if you already have an AI application running on a GPU that is instrumented with OpenLIT. It extends your existing observability to include GPU metrics alongside your LLM traces.
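As a minimal sketch, enabling GPU metrics is a single flag at initialization. The collect_system_metrics parameter is documented under Supported Parameters below; the endpoint URL is a placeholder for your own collector:

```python
import openlit

# Enable GPU and system metrics collection alongside LLM traces.
# The OTLP endpoint is illustrative; point it at your collector.
openlit.init(
    collect_system_metrics=True,
    otlp_endpoint="http://127.0.0.1:4318",
)
```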
OpenTelemetry GPU Collector
Use this for remote GPUs that host only LLM models, or for containerized deployments. This approach collects GPU metrics without modifying application code. A deployment sketch follows.
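As a rough sketch, the collector runs as a container with GPU access and is configured through the environment variables listed under Supported Parameters below. The image name and tag here are assumptions, not an official reference; check the OpenLIT releases for the published image:

```bash
# Illustrative only: the image name/tag are assumptions.
# Requires the NVIDIA Container Toolkit for --gpus all.
docker run --rm --gpus all \
  -e OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318" \
  -e OTEL_SERVICE_NAME="my-gpu-app" \
  -e OTEL_DEPLOYMENT_ENVIRONMENT="production" \
  ghcr.io/openlit/otel-gpu-collector:latest
```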
Supported Parameters
SDK Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| collect_system_metrics | boolean | False | Enable GPU and system metrics collection |
| otlp_endpoint | string | None | OpenTelemetry OTLP endpoint URL |
| otlp_headers | string | None | Authentication headers for OTLP endpoint |
| service_name | string | "unknown_service" | Name of your AI application |
| environment | string | None | Deployment environment (dev, staging, prod) |
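Putting these together, a fully configured initialization might look like the sketch below, assuming each option above maps to a keyword argument of openlit.init (all values are placeholders):

```python
import openlit

openlit.init(
    collect_system_metrics=True,                  # default: False
    otlp_endpoint="http://127.0.0.1:4318",        # OTLP endpoint URL
    otlp_headers="Authorization=Bearer <token>",  # only if your endpoint needs auth
    service_name="my-gpu-app",                    # default: "unknown_service"
    environment="production",                     # dev, staging, prod
)
```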
Environment Variables
| Variable | Description | Example |
|---|---|---|
| OPENLIT_COLLECT_SYSTEM_METRICS | Enable GPU monitoring | true |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP endpoint URL | http://127.0.0.1:4318 |
| OTEL_SERVICE_NAME | Service name for telemetry | my-gpu-app |
| OTEL_DEPLOYMENT_ENVIRONMENT | Deployment environment | production |
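The same settings can be supplied without code changes by exporting these variables before starting your application. A sketch, with placeholder values and an illustrative entry-point filename:

```bash
export OPENLIT_COLLECT_SYSTEM_METRICS=true
export OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"
export OTEL_SERVICE_NAME="my-gpu-app"
export OTEL_DEPLOYMENT_ENVIRONMENT="production"

python app.py  # your OpenLIT-instrumented application
```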
Deploy OpenLIT
Deployment options for scalable LLM monitoring infrastructure
Integrations
60+ AI integrations with automatic instrumentation and performance tracking
Destinations
Send telemetry to Datadog, Grafana, New Relic, and other observability stacks
Running in Kubernetes? Try the OpenLIT Operator
Automatically inject instrumentation into existing workloads without modifying pod specs, container images, or application code.