OpenLIT automatically instruments LLM providers, frameworks, and VectorDBs with OpenTelemetry, so you can track the behavior and performance of your LLM applications through metrics.

This documentation covers the metrics settings, the semantic conventions OpenLIT uses, and the attributes attached to each metric, so you can improve the monitoring and observability of your LLM applications.

Disable Metrics

Metrics collection is enabled by default. To turn it off, pass disable_metrics=True to openlit.init().

Example:

import openlit

# Disable metrics collection
openlit.init(disable_metrics=True)

Using an existing OTel Metrics instance

If your application already has an OpenTelemetry (OTel) Metrics instance, you can pass it directly to OpenLIT with openlit.init(meter=meter). OpenLIT then uses the settings of your existing instance, which keeps the metrics setup unified across your application.

Example:

import openlit

# Instantiate an OpenTelemetry Metrics meter
meter = ...

# Pass the meter to OpenLIT
openlit.init(meter=meter)

Semantic Convention

This section lists the OpenTelemetry metrics that OpenLIT collects from applications using LLMs and Vector Databases. The metrics give a high-level view of application performance and resource usage. They supplement the detailed data captured through tracing and make it easy to build dashboards for quick monitoring of system usage and performance.

GenAI/LLM Metrics

| Metric Name | Description | Unit | Type | Attributes |
| --- | --- | --- | --- | --- |
| gen_ai.total.requests | Number of requests to the LLM. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.prompt_tokens | Number of prompt tokens processed. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.completion_tokens | Number of completion tokens processed. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.total_tokens | Total number of tokens processed. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.cost | The cost distribution of LLM requests. | USD | Histogram | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
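
Because every GenAI metric carries the same attribute set, a dashboard can slice any of them by model, environment, or application. The sketch below mimics that grouping in plain Python. The data points, the model names, and the sum_by helper are made up for illustration; they are not OpenLIT output or part of its API.

```python
from collections import defaultdict

# Hypothetical data points: (metric name, value, attributes).
points = [
    ("gen_ai.usage.total_tokens", 120,
     {"gen_ai.request.model": "model-a", "gen_ai.environment": "production"}),
    ("gen_ai.usage.total_tokens", 80,
     {"gen_ai.request.model": "model-b", "gen_ai.environment": "production"}),
    ("gen_ai.usage.total_tokens", 200,
     {"gen_ai.request.model": "model-a", "gen_ai.environment": "staging"}),
    ("gen_ai.total.requests", 1,
     {"gen_ai.request.model": "model-a", "gen_ai.environment": "staging"}),
]


def sum_by(points, metric, attribute):
    """Sum the values of one metric, grouped by one attribute."""
    totals = defaultdict(int)
    for name, value, attributes in points:
        if name == metric:
            totals[attributes.get(attribute, "unknown")] += value
    return dict(totals)


tokens_by_model = sum_by(points, "gen_ai.usage.total_tokens", "gen_ai.request.model")
tokens_by_env = sum_by(points, "gen_ai.usage.total_tokens", "gen_ai.environment")

print(tokens_by_model)  # {'model-a': 320, 'model-b': 80}
print(tokens_by_env)    # {'production': 200, 'staging': 200}
```

In practice the metrics backend performs this aggregation; the point is only that each attribute in the table is a dimension you can group or filter on.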

VectorDB Metrics

| Metric Name | Description | Unit | Type | Attributes |
| --- | --- | --- | --- | --- |
| db.total.requests | Number of requests to VectorDBs. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment |

GPU Metrics

| Metric Name | Description | Unit | Type | Attributes |
| --- | --- | --- | --- | --- |
| gpu.utilization_percentage | GPU utilization as a percentage | percent | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.enc.utilization_percentage | GPU encoder utilization as a percentage | percent | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.dec.utilization_percentage | GPU decoder utilization as a percentage | percent | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.temperature | GPU temperature in Celsius | Celsius | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.fan_speed | GPU fan speed (0-100) as an integer | Integer | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.available | Available GPU memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.total | Total GPU memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.used | Used GPU memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.free | Free GPU memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.power.draw | GPU power draw in Watts | Watt | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.power.limit | GPU power limit in Watts | Watt | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
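
The memory and power gauges are often most useful in combination. As a sketch, the helper below derives memory utilization and remaining power headroom from one set of readings. The function name and the sample values are hypothetical, and the readings are assumed to come from the same GPU (same gpu_uuid) at the same time.

```python
def gpu_summary(readings):
    """Derive two dashboard-style values from one GPU's gauge readings.

    Memory values are in MB and power values in Watts, as in the table above.
    """
    memory_used_pct = (
        100.0 * readings["gpu.memory.used"] / readings["gpu.memory.total"]
    )
    power_headroom_w = readings["gpu.power.limit"] - readings["gpu.power.draw"]
    return {
        "memory_used_pct": memory_used_pct,
        "power_headroom_w": power_headroom_w,
    }


# Hypothetical readings for a single GPU.
sample = {
    "gpu.memory.used": 6144,
    "gpu.memory.total": 24576,
    "gpu.power.draw": 180,
    "gpu.power.limit": 300,
}

summary = gpu_summary(sample)
print(summary)  # {'memory_used_pct': 25.0, 'power_headroom_w': 120}
```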