OpenLIT automatically instruments LLM providers, frameworks, and VectorDBs with OpenTelemetry, so you can track the behavior and performance of your LLM applications through metrics.

This page covers the metrics settings, the semantic conventions OpenLIT follows, and the attributes attached to each metric.

Disable Metrics

Metrics collection is enabled by default. To turn it off, pass disable_metrics=True to openlit.init().

Example:

import openlit

# Disable metrics collection
openlit.init(disable_metrics=True)

Using an existing OTel Metrics instance

If your application already has an OpenTelemetry (OTel) Metrics instance, you can pass it directly to OpenLIT with openlit.init(meter=meter). OpenLIT then uses that instance and its settings, which keeps a single, unified metrics setup across your application.

Example:

import openlit

# Instantiate an OpenTelemetry Metrics meter
# (placeholder: use the meter your application already creates)
meter = ...

# Pass the meter to OpenLIT
openlit.init(meter=meter)

Add custom resource attributes

The OTEL_RESOURCE_ATTRIBUTES environment variable allows you to provide additional OpenTelemetry resource attributes when starting your application with OpenLIT. OpenLIT already includes some default resource attributes:

  • telemetry.sdk.name: openlit
  • service.name: YOUR_SERVICE_NAME
  • deployment.environment: YOUR_ENVIRONMENT_NAME

Attributes you set through OTEL_RESOURCE_ATTRIBUTES are added on top of these defaults, giving your telemetry data additional context. Format them as a comma-separated list of key=value pairs: key1=value1,key2=value2.

For example:

export OTEL_RESOURCE_ATTRIBUTES="service.instance.id=YOUR_SERVICE_ID,k8s.pod.name=K8S_POD_NAME,k8s.namespace.name=K8S_NAMESPACE,k8s.node.name=K8S_NODE_NAME"
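
If you would rather build the value in Python than type it by hand, the key1=value1,key2=value2 format can be sketched as below. This is a minimal sketch and not part of OpenLIT: format_resource_attributes and parse_resource_attributes are hypothetical helpers, the attribute values are made up, and percent-encoding the values is an assumption based on the OpenTelemetry convention for this variable.

```python
import os
from urllib.parse import quote, unquote


def format_resource_attributes(attrs):
    """Serialize a dict as key1=value1,key2=value2 (hypothetical helper)."""
    # Assumption: values are percent-encoded so that a literal "," or "="
    # inside a value cannot be mistaken for a separator.
    return ",".join(
        f"{key}={quote(str(value), safe='')}" for key, value in attrs.items()
    )


def parse_resource_attributes(raw):
    """Inverse of format_resource_attributes, useful for checking the result."""
    pairs = (item.split("=", 1) for item in raw.split(",") if item.strip())
    return {key.strip(): unquote(value.strip()) for key, value in pairs}


# Made-up example values.
attrs = {
    "service.instance.id": "checkout-1",
    "k8s.pod.name": "checkout-7d9f",
    "k8s.namespace.name": "prod",
}

encoded = format_resource_attributes(attrs)

# Set the variable before OpenLIT is initialized in the same process.
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = encoded
print(encoded)
```

Setting the variable from code only has an effect if it happens before the telemetry setup reads the environment, so exporting it in the shell, as shown above, remains the simpler option.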

Semantic Convention

This section lists the OpenTelemetry metrics that OpenLIT collects from applications using LLMs and Vector Databases. The metrics give a compact view of application performance and resource usage. They complement the detailed data captured through tracing and make it easy to build dashboards for quick monitoring.

GenAI/LLM Metrics

| Metric Name | Description | Unit | Type | Attributes |
| --- | --- | --- | --- | --- |
| gen_ai.total.requests | Number of requests to the LLM. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.input_tokens | Number of input tokens processed. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.output_tokens | Number of output tokens processed. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.total_tokens | Total number of tokens processed. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
| gen_ai.usage.cost | The cost distribution of LLM requests. | USD | Histogram | telemetry.sdk.name, gen_ai.application_name, gen_ai.system, gen_ai.environment, gen_ai.operation.name, gen_ai.request.model |
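
To show how the attributes on these counters can be used when querying, here is a standard-library-only sketch of one running sum per unique attribute set. It does not call OpenLIT or OpenTelemetry: record_request and total are hypothetical helpers, the attribute values and token counts are made up, and treating total tokens as input plus output is an assumption.

```python
from collections import defaultdict

# One running sum per (metric name, attribute set), mimicking how a counter
# keeps a separate series for every unique combination of attributes.
counters = defaultdict(int)


def record_request(attributes, input_tokens, output_tokens):
    """Hypothetical helper: update the request and token counters for one call."""
    key = tuple(sorted(attributes.items()))
    counters[("gen_ai.total.requests", key)] += 1
    counters[("gen_ai.usage.input_tokens", key)] += input_tokens
    counters[("gen_ai.usage.output_tokens", key)] += output_tokens
    # Assumption: total tokens is the sum of input and output tokens.
    counters[("gen_ai.usage.total_tokens", key)] += input_tokens + output_tokens


def total(metric, match):
    """Sum a metric over every attribute set that contains the pairs in match."""
    return sum(
        value
        for (name, key), value in counters.items()
        if name == metric and match.items() <= dict(key).items()
    )


# Made-up attribute values and token counts.
base = {"gen_ai.system": "openai", "gen_ai.operation.name": "chat"}
record_request({**base, "gen_ai.request.model": "model-a"}, 120, 30)
record_request({**base, "gen_ai.request.model": "model-a"}, 80, 20)
record_request({**base, "gen_ai.request.model": "model-b"}, 50, 10)

tokens_model_a = total("gen_ai.usage.total_tokens", {"gen_ai.request.model": "model-a"})
requests_all = total("gen_ai.total.requests", {"gen_ai.system": "openai"})
print(tokens_model_a, requests_all)
```

Filtering on gen_ai.request.model narrows the sum to one model, while filtering on gen_ai.system alone adds up every model from that provider. A dashboard query grouped by these attributes works the same way.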

VectorDB Metrics

| Metric Name | Description | Unit | Type | Attributes |
| --- | --- | --- | --- | --- |
| db.total.requests | Number of requests to VectorDBs. | 1 | Counter | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment |

GPU Metrics

| Metric Name | Description | Unit | Type | Attributes |
| --- | --- | --- | --- | --- |
| gpu.utilization | GPU Utilization in percentage | percent | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.enc.utilization | GPU encoder Utilization in percentage | percent | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.dec.utilization | GPU decoder Utilization in percentage | percent | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.temperature | GPU Temperature in Celsius | Celsius | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.fan_speed | GPU Fan Speed (0-100) as an integer | Integer | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.available | Available GPU Memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.total | Total GPU Memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.used | Used GPU Memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.memory.free | Free GPU Memory in MB | MB | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.power.draw | GPU Power Draw in Watts | Watt | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
| gpu.power.limit | GPU Power Limit in Watts | Watt | Gauge | telemetry.sdk.name, gen_ai.application_name, gen_ai.environment, gpu_index, gpu_name, gpu_uuid |
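
The gauges above report raw readings, and dashboards often combine two of them into a derived value. The sketch below shows two such derivations using plain arithmetic. The helper names and the sample readings are hypothetical and are not produced by OpenLIT.

```python
def memory_used_percent(used_mb, total_mb):
    """Share of GPU memory in use, from gpu.memory.used and gpu.memory.total (MB)."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return 100.0 * used_mb / total_mb


def power_headroom_watts(draw_watts, limit_watts):
    """Watts left before the GPU reaches its power limit."""
    return limit_watts - draw_watts


# Hypothetical readings for a single GPU.
sample = {
    "gpu.memory.used": 6144,
    "gpu.memory.total": 24576,
    "gpu.power.draw": 180.0,
    "gpu.power.limit": 250.0,
}

used_pct = memory_used_percent(sample["gpu.memory.used"], sample["gpu.memory.total"])
headroom = power_headroom_watts(sample["gpu.power.draw"], sample["gpu.power.limit"])
print(f"memory used: {used_pct:.1f}%, power headroom: {headroom:.0f} W")
```

Because every GPU gauge carries gpu_index, gpu_name, and gpu_uuid, derived values like these should be computed per GPU rather than across all devices at once.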