OpenLIT provides an `InstrumentedClient` that automatically emits traces and metrics for every API call, with zero changes to your application logic.
The integration supports:
- Messages (standard and streaming)
- Prompt caching token tracking (`cache_creation_input_tokens`, `cache_read_input_tokens`)
- Tool use
## Get started

### Initialize OpenLIT
Add this once at the start of your application (e.g. in `main()`). You can configure the endpoint in one of two ways:

- Via Config Struct
- Via Environment Variable

Replace `YOUR_OTEL_ENDPOINT` with the URL of your OpenTelemetry backend, such as `http://127.0.0.1:4318` for a local OpenLIT deployment.
### Create an instrumented client

Replace your existing Anthropic client creation with the OpenLIT instrumented client; optional configuration can be passed when the client is constructed.
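The shape of the change is small: construct the client through OpenLIT instead of directly, then use it exactly as before. The identifiers below (`openlit.NewInstrumentedClient`, the config fields) are hypothetical placeholders, not the SDK's confirmed API; consult the OpenLIT documentation for the exact constructor. A Go-style pseudocode sketch:

```go
// Before: client := anthropic.NewClient(...)

// After (hypothetical constructor and field names; check the OpenLIT docs):
client := openlit.NewInstrumentedClient(openlit.Config{
	OTLPEndpoint: "http://127.0.0.1:4318", // or rely on the environment variable
})

// Use the client exactly as you would the plain Anthropic client;
// traces and metrics are emitted for every call automatically.
resp, err := client.Messages.Create(ctx, params)
```

Because instrumentation lives in the client wrapper, no call sites need to change.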
## What gets collected

Every call to the instrumented client automatically records:

| Data | Attribute |
|---|---|
| Operation name | gen_ai.operation.name |
| Model requested | gen_ai.request.model |
| Model used | gen_ai.response.model |
| Response ID | gen_ai.response.id |
| Input tokens | gen_ai.usage.input_tokens |
| Output tokens | gen_ai.usage.output_tokens |
| Cache creation tokens | gen_ai.usage.prompt_tokens_details.cache_write |
| Cache read tokens | gen_ai.usage.prompt_tokens_details.cache_read |
| Estimated cost | gen_ai.usage.cost |
| Finish reason | gen_ai.response.finish_reasons |
| Tool calls | gen_ai.tool.name, gen_ai.tool.call.id |
| Time to first token | gen_ai.server.time_to_first_token (streaming) |
| Time per output token | gen_ai.server.time_per_output_token (streaming) |
The client also records the following metrics:

- `gen_ai.client.token.usage` — token usage histogram (input/output)
- `gen_ai.client.operation.duration` — total operation duration
- `gen_ai.server.time_to_first_token` — TTFT for streaming
- `gen_ai.client.operation.time_to_first_chunk` — client-side TTFT
- `gen_ai.client.operation.time_per_output_chunk` — per-chunk latency
- `gen_ai.server.request.duration` — estimated server processing time

