Tracing

OpenLIT provides OpenTelemetry Auto instrumentation for various LLM providers, frameworks, and VectorDBs, providing you with insights into the behavior and performance of your LLM applications. This documentation covers tracing settings, understanding semantic conventions, and interpreting span attributes to enhance the monitoring and observability of your LLM applications.

Quickstart

Get Started with monitoring your AI Applications in 2 simple steps

Connections

Connect to your existing Observablity Stack

Using an existing OTel Tracer

You have the flexibility to integrate your existing OpenTelemetry (OTel) tracer configuration with OpenLIT. If you already have an OTel tracer instantiated in your application, you can pass it directly to openlit.init(tracer=tracer). This integration ensures that OpenLIT utilizes your custom tracer settings, allowing for a unified tracing setup across your application. Example:

# Instantiate an OpenTelemetry Tracer
tracer = ...

# Pass the tracer to OpenLIT
openlit.init(tracer=tracer)

Add custom resource attributes

The OTEL_RESOURCE_ATTRIBUTES environment variable allows you to provide additional OpenTelemetry resource attributes when starting your application with OpenLIT. OpenLIT already includes some default resource attributes:

telemetry.sdk.name: openlit
service.name: YOUR_SERVICE_NAME
deployment.environment: YOUR_ENVIRONMENT_NAME

You can enhance these default resource attributes by adding your own using the OTEL_RESOURCE_ATTRIBUTES variable. Your custom attributes will be added on top of the existing OpenLIT attributes, providing additional context to your telemetry data. Simply format your attributes as key1=value1,key2=value2. For example:

export OTEL_RESOURCE_ATTRIBUTES="service.instance.id=YOUR_SERVICE_ID,k8s.pod.name=K8S_POD_NAME,k8s.namespace.name=K8S_NAMESPACE,k8s.node.name=K8S_NODE_NAME"

Disable Tracing of Content

By default, OpenLIT adds the prompts and completions to Trace span attributes. However, you may want to disable this logging for privacy reasons, as they may contain highly sensitive data from your users. You may also simply want to reduce the size of your traces. Example:

openlit.init(capture_message_content=False)

Disable Batch

By default, the SDK batches spans using the OpenTelemetry batch span processor. When working locally, sometimes you may wish to disable this behavior. You can do that with this flag. Example:

openlit.init(disable_batch=True)

Disable Instrumentations

By default, OpenLIT automatically detects which models and frameworks you are using and instruments them for you. You can override this and disable instrumentation for specific frameworks and models. Example:

openlit.init(disabled_instrumentors=["anthropic", "langchain"])

Manual Tracing

Using openlit.trace, you get access to manually create traces, allowing you to record every process within a single function.

python

@openlit.trace
def generate_one_liner():
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Return a one liner from any movie for me to guess",
            }
        ],
    )

The trace function automatically groups any LLM function invoked within generate_one_liner, providing you with organized groupings right out of the box. You can do more with traces by running the start_trace context generator:

python

with openlit.start_trace(name="<GIVE_TRACE_A_NAME>") as trace:
    # your code

Use trace.set_result('') to set the final result of the trace and trace.set_metadata({}) to add custom metadata. Full Example

python

@openlit.trace
def generate_one_liner():
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Return a one liner from any movie for me to guess",
            }
        ],
    )

def guess_one_liner(one_liner: str):
    with openlit.start_trace("Guess One-liner") as trace:
        completion = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    "content": f"Guess movie from this line: {one_liner}",
                }
            ],
        )
        trace.set_result(completion.choices[0].message.content)

Semantic Convention

This section outlines the OpenTelemetry traces collected by OpenLIT from applications utilizing Large Language Models (LLMs) and Vector Databases. The span attributes adhere to the GenAI Semantic Conventions established by the OpenTelemetry community, ensuring standardized data collection that enhances monitoring and debugging capabilities.

Span Definitions

This section outlines fundamental definitions related to span types and names used within the OpenTelemetry traces to identify the nature and source of requests.

Attribute	Description	Examples
Span kind	Will always be `CLIENT` when the trace is generated successfully.	`CLIENT`
Span name	Is set to the LLM endpoint which follows the convention `provider_name.operation_name`.	`openai.chat.completions`

General Span Attributes

General span attributes encapsulate the environmental and operational context of AI applications, offering insights into where and how the spans are generated.

Attribute	Type	Description	Examples
`gen_ai.endpoint`	string	The API endpoint used for the LLM request.	`cohere.chat`
`gen_ai.system`	string	The Generative AI product as identified by the client instrumentation.	`openai`, `cohere`, `anthropic`, `groq`, `huggingface`, `azure_openai`, `mistral`, `bedrock`, `vertexai`, `ollama`, `langchain`, `llama_index`, `haystack`
`gen_ai.environment`	string	Deployment environment of the LLM.	`production`
`gen_ai.application_name`	string	The name of the application using the LLM.	`chatbot_app`
`gen_ai.operation.name`	string	The type of LLM operation performed.	`chat`, `embedding`, `audio`, `image`, `fine_tuning`, `vectordb`, `framework`
`gen_ai.hub.owner`	string	The owner of the prompt hub where the prompt is hosted.	`openai`
`gen_ai.hub.repo`	string	The repository within the hub from which the model is used.	`gpt-4`
`gen_ai.retrieval.source`	string	Source from which the model was retrieved.	`slack`
`telemetry.sdk.name`	string	Source of the generated traces	`openlit`

GenAI/LLM Span Attributes

Request Attributes

These attributes detail the specifics of requests sent to LLMs, including configurations and parameters that define the behavior and responses of the LLM.

Attribute	Type	Description	Examples
`gen_ai.request.model`	string	The name of the LLM a request is being made to.	`gpt-4`
`gen_ai.request.temperature`	double	The temperature setting for the LLM request.	`0.0`
`gen_ai.request.top_p`	double	The top_p sampling setting for the LLM request.	`1.0`
`gen_ai.request.top_k`	double	The top_k sampling setting for the LLM request.	`1.0`
`gen_ai.request.max_tokens`	int	The maximum number of tokens the LLM generates for a request.	`100`
`gen_ai.request.is_stream`	boolean	Indicates if the request to LLM is a stream.	`true`
`gen_ai.request.user`	string	Username or identifier for the user making the request.	`user123`
`gen_ai.request.seed`	int	Seed used to generate deterministic results from the LLM.	`42`
`gen_ai.request.frequency_penalty`	double	Frequency penalty parameter in the LLM request.	`0.5`
`gen_ai.request.presence_penalty`	double	Presence penalty parameter in the LLM request.	`0.5`
`gen_ai.request.embedding_format`	string	Format of embeddings requested from the LLM.	`json`
`gen_ai.request.embedding_dimension`	int	Dimensionality of the generated embeddings.	`512`
`gen_ai.request.tool_choice`	string	Tool or library chosen for processing the LLM request.	`PyTorch`
`gen_ai.request.audio_voice`	string	Voice setting for audio generated by LLM.	`alto`
`gen_ai.request.audio_response_format`	string	The format of the audio response from the LLM.	`mp3`
`gen_ai.request.audio_speed`	double	Speed setting for audio responses generated by the LLM.	`1.0`
`gen_ai.request.fine_tune_status`	string	Status or mode of finetuning the model.	`active`
`gen_ai.request.fine_tune_model_suffix`	string	Suffix describing specifics of the finetuned model version.	`v2`
`gen_ai.request.fine_tune_n_epochs`	int	Number of epochs used in training the finetuned model.	`4`
`gen_ai.request.learning_rate_multiplier`	double	Learning rate multiplier used in finetuning.	`0.7`
`gen_ai.request.fine_tune_batch_size`	int	Batch size used in the finetuning process.	`16`
`gen_ai.request.validation_file`	string	Location or type of validation file used in the training process.	`val_data.json`
`gen_ai.request.training_file`	string	Location or type of training file used for the LLM.	`train_data.json`
`gen_ai.request.image_size`	string	Size specifications for the generated image.	`1024x768`
`gen_ai.request.image_quality`	int	Quality parameter for the generated image, usually a percentage.	`90`
`gen_ai.request.image_style`	string	Style or theme of the generated image.	`van_gogh`

Response Attributes

Attributes in this category help trace and understand responses from LLMs, encapsulating data relevant to outcomes and any content generated as a result.

Attribute	Type	Description	Examples
`gen_ai.response.id`	string	The unique identifier for the response.	`resp-456`
`gen_ai.response.finish_reasons`	string	Reasons the model stopped generating tokens.	`stop`
`gen_ai.response.image`	string	The image content generated by the LLM.	`<image URL or base64 string>`

Usage Attributes

These attributes track the consumption and costs associated with LLM requests, facilitating operational analysis and resource management.

Attribute	Type	Description	Examples
`gen_ai.usage.input_tokens`	int	The number of tokens used in the LLM input.	`100`
`gen_ai.usage.output_tokens`	int	The number of tokens used in the LLM output.	`180`
`gen_ai.usage.total_tokens`	int	The total number of tokens used in both the prompt and completion.	`280`
`gen_ai.usage.cost`	decimal	The cost (in USD) associated with the LLM request	`0.05`

Content Attributes

Content attributes specifically relate to the data exchanged with LLMs during requests and responses, including initial prompts and resulting outputs.

Attribute	Type	Description	Example Value
`gen_ai.prompt`	string	The full prompt sent to the LLM.	`"What is the capital of France?"`
`gen_ai.completion`	string	The full response received from the LLM.	`"The capital of France is Paris."`
`gen_ai.content.revised_prompt`	string	Revised or modified prompt, if applicable.	`"What's the capital city of France?"`

VectorDB Span Attributes

This section addresses attributes related to interactions with Vector Databases supporting the LLM operations, providing insights into database operations and related parameters.

Attribute	Type	Description	Example Value
`db.system`	string	System type of the database.	`pinecone`, `chroma`, `qdrant`, `milvus`
`db.collection.name`	string	Name of the collection within the database.	`user_profiles`
`db.operation`	string	Type of database operation performed.	`query`, `delete`, `update`, `upsert`,`add`, `peek`, `create_index`, `create_collection`, `update_collection`, `delete_collection`
`db.operation.status`	string	Status result of the operation (e.g., success, error).	`completed`
`db.operation.cost`	decimal	Cost associated with the operation, if applicable.	`0.0`
`db.ids_count`	int	Count of ids processed or retrieved in an operation.	`150`
`db.vector_count`	int	Number of vectors involved in the operation.	`320`
`db.metadatas_count`	int	Count of metadata entries retrieved or modified.	`120`
`db.documents_count`	int	The number of documents involved in the database operation.	`85`
`db.payload_count`	int	Payload count associated with the operation.	`45`
`db.limit`	int	Limit number for query results.	`50`
`db.offset`	int	The offset for the starting point of the query results.	`10`
`db.where_document`	string	Filter document condition used in operations like query or update.	`"age > 30"`
`db.filter`	string	Applied filters on database query.	`"{ 'status': 'active' }"`
`db.statement`	string	Raw database query statement.	`[0.1, 0.3]`
`db.n_results`	int	Number of results returned from a query operation.	`15`
`db.delete_all`	boolean	Indicates if all records are to be deleted in the operation.	`false`
`db.index.name`	string	Name of the index used or modified.	`user_index`
`db.index.dimension`	int	Dimension of the index involved in the database operation.	`128`
`db.collection.dimension`	int	Dimensionality configuration of the collection.	`128`
`db.create_index.metric`	string	Metric type used for creating indexes.	`"euclidean"`
`db.create_index.spec`	string	Specifications of the index created.	`"type: kd-tree; leaf_size: 40"`
`db.query.namespace`	string	Namespace used for structuring database queries.	`"product_catalog"`
`db.update.metadata`	string	Metadata being updated as part of the operation.	`"category: electronics"`
`db.update.values`	string	Specifies the new values being applied in an update operation.	`"{ 'price': '99.99' }"`
`db.update.id`	string	Unique identifier for the entry being updated.	`"product123"`

Integrations

Integrate your AI Stack with OpenLIT

Connections

Connect to your existing Observablity Stack

Introduction

Features

Integrations

Connections

API Reference

Privacy

Quickstart

Connections

Using an existing OTel Tracer

Add custom resource attributes

Disable Tracing of Content

Disable Batch

Disable Instrumentations

Manual Tracing

Semantic Convention

Span Definitions

General Span Attributes

GenAI/LLM Span Attributes

Request Attributes

Response Attributes

Usage Attributes

Content Attributes

VectorDB Span Attributes

Integrations

Connections

Introduction

Features

Integrations

Connections

API Reference

Privacy

Quickstart

Connections

​Using an existing OTel Tracer

​Add custom resource attributes

​Disable Tracing of Content

​Disable Batch

​Disable Instrumentations

​Manual Tracing

​Semantic Convention

​Span Definitions

​General Span Attributes

​GenAI/LLM Span Attributes

​Request Attributes

​Response Attributes

​Usage Attributes

​Content Attributes

​VectorDB Span Attributes

Integrations

Connections

Using an existing OTel Tracer

Add custom resource attributes

Disable Tracing of Content

Disable Batch

Disable Instrumentations

Manual Tracing

Semantic Convention

Span Definitions

General Span Attributes

GenAI/LLM Span Attributes

Request Attributes

Response Attributes

Usage Attributes

Content Attributes

VectorDB Span Attributes