OpenLIT provides OpenTelemetry auto-instrumentation for various LLM providers, frameworks, and vector databases, giving you insight into the behavior and performance of your LLM applications.

This documentation covers tracing settings, understanding semantic conventions, and interpreting span attributes to enhance the monitoring and observability of your LLM applications.

## Using an Existing OTel Tracer

You have the flexibility to integrate your existing OpenTelemetry (OTel) tracer configuration with OpenLIT. If you already have an OTel tracer instantiated in your application, you can pass it directly to openlit.init(tracer=tracer). This integration ensures that OpenLIT utilizes your custom tracer settings, allowing for a unified tracing setup across your application.

Example:

```python
import openlit

# Instantiate an OpenTelemetry tracer
tracer = ...

# Pass the tracer to OpenLIT
openlit.init(tracer=tracer)
```

## Disable Tracing of Content

By default, OpenLIT adds prompts and completions to trace span attributes.

However, you may want to disable this for privacy reasons, since they can contain highly sensitive data from your users. You may also simply want to reduce the size of your traces.

Example:

```python
openlit.init(trace_content=False)
```

## Disable Batch

By default, the SDK batches spans using the OpenTelemetry batch span processor. When working locally, sometimes you may wish to disable this behavior. You can do that with this flag.

Example:

```python
openlit.init(disable_batch=True)
```

## Disable Instrumentations

By default, OpenLIT automatically detects which models and frameworks you are using and instruments them for you. You can override this and disable instrumentation for specific frameworks and models.

Example:

```python
openlit.init(disabled_instrumentors=["anthropic", "langchain"])
```

## Manual Tracing

Using the `openlit.trace` decorator, you can create traces manually, allowing you to record every process within a single function.

```python
@openlit.trace
def generate_one_liner():
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Return a one liner from any movie for me to guess",
            }
        ],
    )
    return completion.choices[0].message.content
```

The `trace` decorator automatically groups any LLM call invoked within `generate_one_liner`, providing you with organized groupings right out of the box.

You can do more with traces by using the `openlit.start_trace` context manager:

```python
with openlit.start_trace(name="<GIVE_TRACE_A_NAME>") as trace:
    # your code
    ...
```
Use `trace.set_result('')` to set the final result of the trace and `trace.set_metadata({})` to add custom metadata.

### Full Example

```python
@openlit.trace
def generate_one_liner():
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Return a one liner from any movie for me to guess",
            }
        ],
    )
    return completion.choices[0].message.content

def guess_one_liner(one_liner: str):
    with openlit.start_trace("Guess One-liner") as trace:
        completion = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    "content": f"Guess movie from this line: {one_liner}",
                }
            ],
        )
        trace.set_result(completion.choices[0].message.content)
```

## Semantic Convention

This section outlines the OpenTelemetry traces collected by OpenLIT from applications utilizing Large Language Models (LLMs) and Vector Databases. The span attributes adhere to the GenAI Semantic Conventions established by the OpenTelemetry community, ensuring standardized data collection that enhances monitoring and debugging capabilities.

### Span Definitions

This section outlines fundamental definitions related to span types and names used within the OpenTelemetry traces to identify the nature and source of requests.

| Attribute | Description | Examples |
| --- | --- | --- |
| Span kind | Will always be `CLIENT` when the trace is generated successfully. | `CLIENT` |
| Span name | Set to the LLM endpoint, which follows the convention `provider_name.operation_name`. | `openai.chat.completions` |

### General Span Attributes

General span attributes encapsulate the environmental and operational context of LLM applications, offering insights into where and how the spans are generated.

| Attribute | Type | Description | Examples |
| --- | --- | --- | --- |
| `gen_ai.endpoint` | string | The API endpoint used for the LLM request. | `cohere.chat` |
| `gen_ai.system` | string | The Generative AI product as identified by the client instrumentation. | `openai`, `cohere`, `anthropic`, `groq`, `huggingface`, `azure_openai`, `mistral`, `bedrock`, `vertexai`, `ollama`, `langchain`, `llama_index`, `haystack` |
| `gen_ai.environment` | string | Deployment environment of the LLM. | `production` |
| `gen_ai.application_name` | string | The name of the application using the LLM. | `chatbot_app` |
| `gen_ai.operation.name` | string | The type of LLM operation performed. | `chat`, `embedding`, `audio`, `image`, `fine_tuning`, `vectordb`, `framework` |
| `gen_ai.hub.owner` | string | The owner of the prompt hub where the prompt is hosted. | `openai` |
| `gen_ai.hub.repo` | string | The repository within the hub from which the model is used. | `gpt-4` |
| `gen_ai.retrieval.source` | string | Source from which the model was retrieved. | `slack` |
| `telemetry.sdk.name` | string | Source of the generated traces. | `openlit` |

### GenAI/LLM Span Attributes

#### Request Attributes

These attributes detail the specifics of requests sent to LLMs, including configurations and parameters that define the behavior and responses of the LLM.

| Attribute | Type | Description | Examples |
| --- | --- | --- | --- |
| `gen_ai.request.model` | string | The name of the LLM a request is being made to. | `gpt-4` |
| `gen_ai.request.temperature` | double | The temperature setting for the LLM request. | `0.0` |
| `gen_ai.request.top_p` | double | The top_p sampling setting for the LLM request. | `1.0` |
| `gen_ai.request.top_k` | double | The top_k sampling setting for the LLM request. | `1.0` |
| `gen_ai.request.max_tokens` | int | The maximum number of tokens the LLM generates for a request. | `100` |
| `gen_ai.request.is_stream` | boolean | Indicates if the request to the LLM is a stream. | `true` |
| `gen_ai.request.user` | string | Username or identifier for the user making the request. | `user123` |
| `gen_ai.request.seed` | int | Seed used to generate deterministic results from the LLM. | `42` |
| `gen_ai.request.frequency_penalty` | double | Frequency penalty parameter in the LLM request. | `0.5` |
| `gen_ai.request.presence_penalty` | double | Presence penalty parameter in the LLM request. | `0.5` |
| `gen_ai.request.embedding_format` | string | Format of embeddings requested from the LLM. | `json` |
| `gen_ai.request.embedding_dimension` | int | Dimensionality of the generated embeddings. | `512` |
| `gen_ai.request.tool_choice` | string | Tool or library chosen for processing the LLM request. | `PyTorch` |
| `gen_ai.request.audio_voice` | string | Voice setting for audio generated by the LLM. | `alto` |
| `gen_ai.request.audio_response_format` | string | The format of the audio response from the LLM. | `mp3` |
| `gen_ai.request.audio_speed` | double | Speed setting for audio responses generated by the LLM. | `1.0` |
| `gen_ai.request.fine_tune_status` | string | Status or mode of fine-tuning the model. | `active` |
| `gen_ai.request.fine_tune_model_suffix` | string | Suffix describing specifics of the fine-tuned model version. | `v2` |
| `gen_ai.request.fine_tune_n_epochs` | int | Number of epochs used in training the fine-tuned model. | `4` |
| `gen_ai.request.learning_rate_multiplier` | double | Learning rate multiplier used in fine-tuning. | `0.7` |
| `gen_ai.request.fine_tune_batch_size` | int | Batch size used in the fine-tuning process. | `16` |
| `gen_ai.request.validation_file` | string | Location or type of validation file used in the training process. | `val_data.json` |
| `gen_ai.request.training_file` | string | Location or type of training file used for the LLM. | `train_data.json` |
| `gen_ai.request.image_size` | string | Size specifications for the generated image. | `1024x768` |
| `gen_ai.request.image_quality` | int | Quality parameter for the generated image, usually a percentage. | `90` |
| `gen_ai.request.image_style` | string | Style or theme of the generated image. | `van_gogh` |

#### Response Attributes

Attributes in this category help trace and understand responses from LLMs, encapsulating data relevant to outcomes and any content generated as a result.

| Attribute | Type | Description | Examples |
| --- | --- | --- | --- |
| `gen_ai.response.id` | string | The unique identifier for the response. | `resp-456` |
| `gen_ai.response.finish_reasons` | string | Reasons the model stopped generating tokens. | `stop` |
| `gen_ai.response.image` | string | The image content generated by the LLM. | `<image URL or base64 string>` |

#### Usage Attributes

These attributes track the consumption and costs associated with LLM requests, facilitating operational analysis and resource management.

| Attribute | Type | Description | Examples |
| --- | --- | --- | --- |
| `gen_ai.usage.prompt_tokens` | int | The number of tokens used in the LLM prompt. | `100` |
| `gen_ai.usage.completion_tokens` | int | The number of tokens used in the LLM response (completion). | `180` |
| `gen_ai.usage.total_tokens` | int | The total number of tokens used in both the prompt and completion. | `280` |
| `gen_ai.usage.cost` | decimal | The cost (in USD) associated with the LLM request. | `0.05` |
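These attributes are related by simple arithmetic: `gen_ai.usage.total_tokens` is the sum of prompt and completion tokens, and cost is derived from the token counts and the provider's per-token pricing. A sketch of that relationship, using hypothetical per-1K-token prices (not actual provider pricing):

```python
# Sketch: derive the usage attributes from token counts and (hypothetical) prices.
def usage_attributes(prompt_tokens: int, completion_tokens: int,
                     prompt_price_per_1k: float,
                     completion_price_per_1k: float) -> dict:
    total = prompt_tokens + completion_tokens
    # Cost = tokens / 1000 * price-per-1K-tokens, summed over prompt and completion
    cost = (prompt_tokens / 1000) * prompt_price_per_1k \
         + (completion_tokens / 1000) * completion_price_per_1k
    return {
        "gen_ai.usage.prompt_tokens": prompt_tokens,
        "gen_ai.usage.completion_tokens": completion_tokens,
        "gen_ai.usage.total_tokens": total,
        "gen_ai.usage.cost": round(cost, 6),
    }

attrs = usage_attributes(100, 180, 0.15, 0.60)
print(attrs["gen_ai.usage.total_tokens"])  # -> 280
```

Tracking cost as a span attribute lets you aggregate spend per application, environment, or user directly in your observability backend.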

#### Content Attributes

Content attributes specifically relate to the data exchanged with LLMs during requests and responses, including initial prompts and resulting outputs.

| Attribute | Type | Description | Example Value |
| --- | --- | --- | --- |
| `gen_ai.prompt` | string | The full prompt sent to the LLM. | `"What is the capital of France?"` |
| `gen_ai.completion` | string | The full response received from the LLM. | `"The capital of France is Paris."` |
| `gen_ai.content.revised_prompt` | string | Revised or modified prompt, if applicable. | `"What's the capital city of France?"` |

### VectorDB Span Attributes

This section addresses attributes related to interactions with Vector Databases supporting the LLM operations, providing insights into database operations and related parameters.

| Attribute | Type | Description | Example Value |
| --- | --- | --- | --- |
| `db.system` | string | System type of the database. | `pinecone`, `chroma`, `qdrant`, `milvus` |
| `db.collection.name` | string | Name of the collection within the database. | `user_profiles` |
| `db.operation` | string | Type of database operation performed. | `query`, `delete`, `update`, `upsert`, `add`, `peek`, `create_index`, `create_collection`, `update_collection`, `delete_collection` |
| `db.operation.status` | string | Status result of the operation (e.g., success, error). | `completed` |
| `db.operation.cost` | decimal | Cost associated with the operation, if applicable. | `0.0` |
| `db.ids_count` | int | Count of IDs processed or retrieved in an operation. | `150` |
| `db.vector_count` | int | Number of vectors involved in the operation. | `320` |
| `db.metadatas_count` | int | Count of metadata entries retrieved or modified. | `120` |
| `db.documents_count` | int | The number of documents involved in the database operation. | `85` |
| `db.payload_count` | int | Payload count associated with the operation. | `45` |
| `db.limit` | int | Limit number for query results. | `50` |
| `db.offset` | int | The offset for the starting point of the query results. | `10` |
| `db.where_document` | string | Filter document condition used in operations like query or update. | `"age > 30"` |
| `db.filter` | string | Applied filters on database query. | `"{ 'status': 'active' }"` |
| `db.statement` | string | Raw database query statement. | `[0.1, 0.3]` |
| `db.n_results` | int | Number of results returned from a query operation. | `15` |
| `db.delete_all` | boolean | Indicates if all records are to be deleted in the operation. | `false` |
| `db.index.name` | string | Name of the index used or modified. | `user_index` |
| `db.index.dimension` | int | Dimension of the index involved in the database operation. | `128` |
| `db.collection.dimension` | int | Dimensionality configuration of the collection. | `128` |
| `db.create_index.metric` | string | Metric type used for creating indexes. | `euclidean` |
| `db.create_index.spec` | string | Specifications of the index created. | `type: kd-tree; leaf_size: 40` |
| `db.query.namespace` | string | Namespace used for structuring database queries. | `product_catalog` |
| `db.update.metadata` | string | Metadata being updated as part of the operation. | `category: electronics` |
| `db.update.values` | string | Specifies the new values being applied in an update operation. | `{ 'price': '99.99' }` |
| `db.update.id` | string | Unique identifier for the entry being updated. | `product123` |