Monitor your AI agent governance pipeline with Agent Governance Toolkit and OpenLIT. Surface policy evaluations, trust score changes, capability violations, and audit integrity alongside your existing AI observability metrics.

Overview

Agent Governance Toolkit emits OpenTelemetry signals for every governance decision in your AI agent system:
| Signal | Description | OTel Type |
|---|---|---|
| governance.policy.evaluation | Policy check result (allow/deny/warn) with latency | Span |
| governance.trust.score_change | Agent trust score movement with reason | Span |
| governance.capability.request | Tool/capability access attempt (granted/denied) | Span |
| governance.violation | Policy violation event with severity | Span + Event |
| governance.policy.eval_count | Total policy evaluations | Counter |
| governance.policy.violation_count | Total violations by severity | Counter |
| governance.policy.eval_latency_ms | Policy evaluation latency | Histogram |
| governance.trust.score | Current agent trust scores | Gauge |

Get Started

1. Install packages

pip install openlit agent-governance-toolkit
2. Initialize OpenLIT and the governance tracer

Add both initializations at your application entry point:
import openlit
from agent_os.telemetry import GovernanceTracer

# Initialize OpenLIT (sends to OpenLIT platform by default)
openlit.init()

# Initialize governance tracing (uses the same OTEL exporter)
tracer = GovernanceTracer(service_name="my-agent-system")
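
With no arguments, openlit.init() sends data to the OpenLIT platform. If you run your own OTLP collector, one way to redirect the exporters is the standard OpenTelemetry SDK environment variables — a sketch below, where the endpoint is a placeholder and it is assumed (not confirmed by this page) that GovernanceTracer builds its exporter through the OTel SDK and therefore honors these variables:

```python
import os

# Standard OpenTelemetry SDK environment variables (placeholder endpoint).
# Set these BEFORE calling openlit.init() / GovernanceTracer(), since the
# OTel SDK reads them when the exporter is constructed.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://127.0.0.1:4318"
os.environ["OTEL_SERVICE_NAME"] = "my-agent-system"
```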
3. Instrument governance decisions

The toolkit automatically traces policy evaluations. You can also emit custom governance spans:
from agent_os.policies import PolicyEvaluator

evaluator = PolicyEvaluator()
evaluator.load_yaml("""
name: production-safety
rules:
  - name: block-dangerous-tools
    condition: "tool_name in ['rm_rf', 'drop_database', 'send_email_all']"
    action: deny
    description: "Block dangerous tool access"
""")

# Every evaluate() call emits a governance.policy.evaluation span
decision = evaluator.evaluate({"tool_name": "read_file", "agent_id": "agent-1"})
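
The shape of the returned decision object isn't shown above, so as a standalone illustration (plain Python, not the toolkit's API) the deny semantics of the block-dangerous-tools rule look like this:

```python
# Plain-Python mirror of the block-dangerous-tools rule above — for
# illustration only. The real check is evaluator.evaluate(), which also
# emits the governance.policy.evaluation span.
DANGEROUS_TOOLS = {"rm_rf", "drop_database", "send_email_all"}

def verdict(context: dict) -> str:
    """Return 'deny' when the requested tool matches the rule, else 'allow'."""
    return "deny" if context.get("tool_name") in DANGEROUS_TOOLS else "allow"

print(verdict({"tool_name": "read_file", "agent_id": "agent-1"}))      # allow
print(verdict({"tool_name": "drop_database", "agent_id": "agent-1"}))  # deny
```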
4. View in OpenLIT

Governance signals appear in OpenLIT alongside your LLM traces:
  • Traces → Filter by governance.* span names to see policy decisions
  • Metrics → governance.policy.eval_count, governance.policy.violation_count
  • Dashboards → Create custom panels for governance health (see below)

Custom Dashboard: Governance Health

Create a governance-specific dashboard in OpenLIT to monitor:
  1. Policy evaluation rate — How many governance checks per minute
  2. Violation trend — Spikes in policy violations over time
  3. Evaluation latency — Are governance checks meeting the <5 ms SLA?
  4. Trust score distribution — Current trust scores across your agent fleet
Use the Dashboard Builder with these metric names:
| Widget | Metric | Type |
|---|---|---|
| Eval rate | governance.policy.eval_count | Stat |
| Violations over time | governance.policy.violation_count | Line chart |
| P99 eval latency | governance.policy.eval_latency_ms | Area chart |
| Trust scores | governance.trust.score | Bar chart |

Prometheus Metrics

If you use Prometheus + Grafana instead of (or alongside) OpenLIT, the toolkit also exposes a Prometheus endpoint:
from agent_os.telemetry import GovernanceMetrics

# Bind to localhost to avoid unintended network exposure.
# Adjust host/port as needed for your deployment.
metrics = GovernanceMetrics(host="127.0.0.1", port=9090)
metrics.start()

# Metrics auto-update as governance decisions flow
# Scrape http://localhost:9090/metrics
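
The metric names exposed on that endpoint are an assumption here (Prometheus exporters conventionally convert OTel dots to underscores, so governance.policy.eval_count would surface as governance_policy_eval_count). Once scraped, the text exposition format can be filtered with a few lines:

```python
def governance_metrics(metrics_text: str) -> dict:
    """Extract governance_* series from Prometheus text exposition output.

    Keeps the label set as part of the key; skips HELP/TYPE comment lines.
    """
    out = {}
    for line in metrics_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue
        # Each sample line is "<name>{labels} <value>"; split on the last space.
        name, _, value = line.rpartition(" ")
        if name.startswith("governance_"):
            out[name] = float(value)
    return out

# Minimal sample in Prometheus text format (names are illustrative).
sample = """\
# HELP governance_policy_eval_count Total policy evaluations
# TYPE governance_policy_eval_count counter
governance_policy_eval_count 42
governance_trust_score{agent="agent-1"} 0.93
"""
print(governance_metrics(sample))
```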

OpenLIT Dashboard

A pre-built Governance Health dashboard is available for import into OpenLIT. It provides:
  • Total Policy Evaluations — count with trend
  • Total Violations — count with severity breakdown
  • Violations Over Time — time-series chart
  • Policy Eval Latency — p50/p99 latency
  • Violations by Agent — which agents trigger the most violations
  • Recent Violations — audit table with agent, tool, policy, verdict
Import via OpenLIT → Dashboards → Import using the JSON below:
{
  "id": "a1c2d3e4-f5a6-4b7c-8d9e-0f1a2b3c4d5e",
  "title": "Governance Health",
  "description": "Monitor AI agent governance — policy evaluations, violations, trust scores, and enforcement latency across your agent fleet.",
  "parentId": null,
  "isMainDashboard": false,
  "isPinned": false,
  "widgets": {
    "b2d4f6a8-1c3e-4a5b-8d7f-9e0a1b2c3d4e": {
      "id": "b2d4f6a8-1c3e-4a5b-8d7f-9e0a1b2c3d4e",
      "title": "Total Policy Evaluations",
      "description": "Total number of governance policy evaluations in the selected time period with trend comparison",
      "type": "STAT_CARD",
      "properties": {
        "value": "0.total_evaluations",
        "color": "#4CAF50",
        "trend": "0.rate",
        "trendSuffix": "%"
      },
      "config": {
        "query": "WITH\n    parseDateTimeBestEffort('{{filter.timeLimit.start}}') AS start_time,\n    parseDateTimeBestEffort('{{filter.timeLimit.end}}') AS end_time,\n    (end_time - start_time) AS duration,\n    (start_time - duration) AS prev_start_time,\n    (end_time - duration) AS prev_end_time\n\nSELECT\n    CAST(countIf(\n        Timestamp >= start_time AND Timestamp <= end_time\n    ) AS INTEGER) AS total_evaluations,\n    CAST(countIf(\n        Timestamp >= prev_start_time AND Timestamp <= prev_end_time\n    ) AS INTEGER) AS previous_total_evaluations,\n    round(\n        if(\n            countIf(Timestamp >= prev_start_time AND Timestamp <= prev_end_time) = 0,\n            countIf(Timestamp >= start_time AND Timestamp <= end_time) * 100.0,\n            (countIf(Timestamp >= start_time AND Timestamp <= end_time) - countIf(Timestamp >= prev_start_time AND Timestamp <= prev_end_time)) /\n            countIf(Timestamp >= prev_start_time AND Timestamp <= prev_end_time) * 100.0\n        ), 4\n    ) AS rate\nFROM otel_traces\nWHERE SpanName = 'governance.policy.evaluation'\n    AND Timestamp >= prev_start_time AND Timestamp <= end_time\n"
      }
    },
    "c3e5a7b9-2d4f-4c6a-9e8b-0f1a2b3c4d5f": {
      "id": "c3e5a7b9-2d4f-4c6a-9e8b-0f1a2b3c4d5f",
      "title": "Total Violations",
      "description": "Total number of governance violations detected in the selected time period with trend comparison",
      "type": "STAT_CARD",
      "properties": {
        "value": "0.total_violations",
        "color": "#F44336",
        "trend": "0.rate",
        "trendSuffix": "%"
      },
      "config": {
        "query": "WITH\n    parseDateTimeBestEffort('{{filter.timeLimit.start}}') AS start_time,\n    parseDateTimeBestEffort('{{filter.timeLimit.end}}') AS end_time,\n    (end_time - start_time) AS duration,\n    (start_time - duration) AS prev_start_time,\n    (end_time - duration) AS prev_end_time\n\nSELECT\n    CAST(countIf(\n        Timestamp >= start_time AND Timestamp <= end_time\n    ) AS INTEGER) AS total_violations,\n    CAST(countIf(\n        Timestamp >= prev_start_time AND Timestamp <= prev_end_time\n    ) AS INTEGER) AS previous_total_violations,\n    round(\n        if(\n            countIf(Timestamp >= prev_start_time AND Timestamp <= prev_end_time) = 0,\n            countIf(Timestamp >= start_time AND Timestamp <= end_time) * 100.0,\n            (countIf(Timestamp >= start_time AND Timestamp <= end_time) - countIf(Timestamp >= prev_start_time AND Timestamp <= prev_end_time)) /\n            countIf(Timestamp >= prev_start_time AND Timestamp <= prev_end_time) * 100.0\n        ), 4\n    ) AS rate\nFROM otel_traces\nWHERE SpanName = 'governance.violation'\n    AND Timestamp >= prev_start_time AND Timestamp <= end_time\n"
      }
    },
    "d4f6b8ca-3e5a-4d7b-ae9c-1f2a3b4c5d6e": {
      "id": "d4f6b8ca-3e5a-4d7b-ae9c-1f2a3b4c5d6e",
      "title": "Violations Over Time",
      "description": "Time-series chart showing governance violations grouped by time bucket",
      "type": "LINE_CHART",
      "properties": {
        "xAxis": "request_time",
        "yAxis": "violation_count",
        "color": "#F44336"
      },
      "config": {
        "query": "WITH\n    parseDateTimeBestEffort('{{filter.timeLimit.start}}') AS start_time,\n    parseDateTimeBestEffort('{{filter.timeLimit.end}}') AS end_time,\n    dateDiff('day', start_time, end_time) AS days_diff,\n    dateDiff('year', start_time, end_time) AS years_diff,\n    multiIf(years_diff >= 1, 'month', days_diff <= 1, 'hour', 'day') AS date_granularity\n\nSELECT\n    CAST(COUNT(*) AS INTEGER) AS violation_count,\n    formatDateTime(DATE_TRUNC(date_granularity, Timestamp), '%Y/%m/%d %R') AS request_time\nFROM otel_traces\nWHERE SpanName = 'governance.violation'\n    AND Timestamp >= start_time AND Timestamp <= end_time\nGROUP BY request_time\nORDER BY request_time"
      }
    },
    "e5a7c9db-4f6b-4e8c-bf0d-2a3b4c5d6e7f": {
      "id": "e5a7c9db-4f6b-4e8c-bf0d-2a3b4c5d6e7f",
      "title": "Avg Policy Eval Latency",
      "description": "Average duration of governance policy evaluation spans in milliseconds with trend comparison",
      "type": "STAT_CARD",
      "properties": {
        "value": "0.avg_latency_ms",
        "suffix": " ms",
        "color": "#2196F3",
        "trend": "0.rate",
        "trendSuffix": "%"
      },
      "config": {
        "query": "WITH\n    parseDateTimeBestEffort('{{filter.timeLimit.start}}') AS start_time,\n    parseDateTimeBestEffort('{{filter.timeLimit.end}}') AS end_time,\n    (end_time - start_time) AS duration,\n    (start_time - duration) AS prev_start_time,\n    (end_time - duration) AS prev_end_time\n\nSELECT\n    round(avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n        Timestamp >= start_time AND Timestamp <= end_time\n        AND notEmpty(SpanAttributes['governance.policy.eval_latency_ms'])), 2) AS avg_latency_ms,\n    round(avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n        Timestamp >= prev_start_time AND Timestamp <= prev_end_time\n        AND notEmpty(SpanAttributes['governance.policy.eval_latency_ms'])), 2) AS previous_avg_latency_ms,\n    round(if(\n        avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n            Timestamp >= prev_start_time AND Timestamp <= prev_end_time\n            AND notEmpty(SpanAttributes['governance.policy.eval_latency_ms'])) = 0,\n        avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n            Timestamp >= start_time AND Timestamp <= end_time\n            AND notEmpty(SpanAttributes['governance.policy.eval_latency_ms'])) * 100.0,\n        (avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n            Timestamp >= start_time AND Timestamp <= end_time\n            AND notEmpty(SpanAttributes['governance.policy.eval_latency_ms'])) -\n         avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n            Timestamp >= prev_start_time AND Timestamp <= prev_end_time\n            AND notEmpty(SpanAttributes['governance.policy.eval_latency_ms']))) /\n        avgIf(toFloat64OrZero(SpanAttributes['governance.policy.eval_latency_ms']),\n            Timestamp >= prev_start_time AND Timestamp <= prev_end_time\n            AND 
notEmpty(SpanAttributes['governance.policy.eval_latency_ms'])) * 100.0\n    ), 4) AS rate\nFROM otel_traces\nWHERE SpanName = 'governance.policy.evaluation'\n    AND Timestamp >= prev_start_time AND Timestamp <= end_time\n"
      }
    },
    "f6b8daec-5a7c-4f9d-c01e-3b4c5d6e7f8a": {
      "id": "f6b8daec-5a7c-4f9d-c01e-3b4c5d6e7f8a",
      "title": "Violations by Agent",
      "description": "Violations grouped by agent name to identify which agents trigger the most governance violations",
      "type": "BAR_CHART",
      "properties": {
        "xAxis": "agent_name",
        "yAxis": "violation_count",
        "color": "#FF9800"
      },
      "config": {
        "query": "WITH\n    parseDateTimeBestEffort('{{filter.timeLimit.start}}') AS start_time,\n    parseDateTimeBestEffort('{{filter.timeLimit.end}}') AS end_time\n\nSELECT\n    if(notEmpty(SpanAttributes['gen_ai.agent.name']), SpanAttributes['gen_ai.agent.name'], 'unknown') AS agent_name,\n    CAST(COUNT(*) AS INTEGER) AS violation_count\nFROM otel_traces\nWHERE SpanName = 'governance.violation'\n    AND Timestamp >= start_time AND Timestamp <= end_time\nGROUP BY agent_name\nORDER BY violation_count DESC\nLIMIT 10"
      }
    },
    "a7c9ebfd-6b8d-4a0e-d12f-4c5d6e7f8a9b": {
      "id": "a7c9ebfd-6b8d-4a0e-d12f-4c5d6e7f8a9b",
      "title": "Recent Violations",
      "description": "Audit table showing recent governance violations with agent, tool, policy, and verdict details",
      "type": "TABLE",
      "properties": {
        "color": "#F44336"
      },
      "config": {
        "query": "WITH\n    parseDateTimeBestEffort('{{filter.timeLimit.start}}') AS start_time,\n    parseDateTimeBestEffort('{{filter.timeLimit.end}}') AS end_time\n\nSELECT\n    formatDateTime(Timestamp, '%Y/%m/%d %R') AS time,\n    if(notEmpty(SpanAttributes['gen_ai.agent.name']), SpanAttributes['gen_ai.agent.name'], 'unknown') AS agent,\n    if(notEmpty(SpanAttributes['governance.tool']), SpanAttributes['governance.tool'], 'N/A') AS tool,\n    if(notEmpty(SpanAttributes['governance.policy']), SpanAttributes['governance.policy'], 'N/A') AS policy,\n    if(notEmpty(SpanAttributes['governance.verdict']), SpanAttributes['governance.verdict'], 'N/A') AS verdict\nFROM otel_traces\nWHERE SpanName = 'governance.violation'\n    AND Timestamp >= start_time AND Timestamp <= end_time\nORDER BY Timestamp DESC\nLIMIT 50"
      }
    }
  },
  "tags": "[\"governance\", \"security\"]",
  "layouts": {
    "lg": [
      { "i": "b2d4f6a8-1c3e-4a5b-8d7f-9e0a1b2c3d4e", "x": 0, "y": 0, "w": 1, "h": 1 },
      { "i": "c3e5a7b9-2d4f-4c6a-9e8b-0f1a2b3c4d5f", "x": 1, "y": 0, "w": 1, "h": 1 },
      { "i": "e5a7c9db-4f6b-4e8c-bf0d-2a3b4c5d6e7f", "x": 2, "y": 0, "w": 1, "h": 1 },
      { "i": "d4f6b8ca-3e5a-4d7b-ae9c-1f2a3b4c5d6e", "x": 0, "y": 1, "w": 4, "h": 2 },
      { "i": "f6b8daec-5a7c-4f9d-c01e-3b4c5d6e7f8a", "x": 0, "y": 3, "w": 2, "h": 2 },
      { "i": "a7c9ebfd-6b8d-4a0e-d12f-4c5d6e7f8a9b", "x": 2, "y": 3, "w": 2, "h": 2 }
    ]
  }
}

Resources

Agent Governance Toolkit

Policy enforcement, trust verification, audit trails, and capability governance for AI agents

OpenLIT Dashboards

Build custom dashboards for governance health metrics