# Programmatic Evaluations

> Quickly evaluate your LLMs and AI Agent responses for Hallucination, Bias, and Toxicity

This guide demonstrates how to implement LLM evaluation tools to assess model output quality. With OpenLIT's programmatic evaluations, you can perform LLM hallucination detection, bias detection, and toxicity filtering using production-ready evaluation metrics.

Learn how to use our `All` evaluator for comprehensive LLM output assessment, measuring hallucination, bias, and toxicity simultaneously. We'll also show you how to collect OpenTelemetry evaluation metrics for continuous model performance monitoring.

Set up evaluations for large language models with just two lines of code:

```python theme={null}
import openlit

# Comprehensive LLM evaluation: hallucination detection, bias detection, toxicity filtering
evals = openlit.evals.All()
result = evals.measure()
```

Full Example:

```python example.py theme={null}
import os

import openlit

# openlit can also read the OPENAI_API_KEY variable directly from env if not specified via function argument
openai_api_key = os.getenv("OPENAI_API_KEY")

# Production-ready LLM evaluation tools for hallucination detection and bias screening
evals = openlit.evals.All(provider="openai", api_key=openai_api_key)

contexts = ["Einstein won the Nobel Prize for his discovery of the photoelectric effect in 1921"]
prompt = "When and why did Einstein win the Nobel Prize?"
text = "Einstein won the Nobel Prize in 1969 for his discovery of the photoelectric effect"

result = evals.measure(prompt=prompt, contexts=contexts, text=text)
print(result)
```

```sh Output theme={null}
verdict='yes' evaluation='Hallucination' score=0.9 classification='factual_inaccuracy' explanation='The text incorrectly states that Einstein won the Nobel Prize in 1969, while the context specifies that he won it in 1921 for his discovery of the photoelectric effect, leading to a significant factual inconsistency.'
```

The TypeScript SDK follows the same pattern:

```typescript theme={null}
import openlit from "openlit"

// Comprehensive LLM evaluation: hallucination detection, bias detection, toxicity filtering
const evals = new openlit.evals.All()
const result = await evals.measure()
```

Full Example:

```typescript theme={null}
import openlit from "openlit"

// Production-ready LLM evaluation tools for hallucination detection and bias screening
const evals = new openlit.evals.All({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY,
})

const contexts = ["Einstein won the Nobel Prize for his discovery of the photoelectric effect in 1921"];
const prompt = "When and why did Einstein win the Nobel Prize?";
const text = "Einstein won the Nobel Prize in 1969 for his discovery of the photoelectric effect";

const result = await evals.measure({ prompt, contexts, text });
console.log(result)
```

The `All` evaluator assesses model outputs for hallucination, bias, and toxicity simultaneously. For targeted model evaluation, use specific evaluators:

- Hallucination: Detect factual inaccuracies and false information in LLM responses
- Bias: Identify potential biases and unfair representations in AI outputs
- Toxicity: Screen content for harmful, offensive, or inappropriate material

For advanced LLM evaluation metrics and supported providers, explore our [Evaluations Guide](/latest/sdk/features/evaluations).

To send evaluation scores to OpenTelemetry backends, your application needs to be instrumented via OpenLIT.
Choose from three instrumentation methods, then add `collect_metrics=True` (`collectMetrics: true` in TypeScript) to track hallucination detection, bias screening, and toxicity filtering metrics.

No code changes needed - instrument via CLI:

```bash theme={null}
# Run with zero-code instrumentation
openlit-instrument python your_app.py
```

Then in your application:

```python theme={null}
import openlit

# Enable evaluation metrics tracking - OpenLIT instrumentation handles the rest
evals = openlit.evals.All(collect_metrics=True)
result = evals.measure(prompt=prompt, contexts=contexts, text=text)
```

Add OpenLIT initialization to your application:

```python theme={null}
import openlit

# Initialize OpenLIT for LLM evaluation metrics collection
openlit.init()

# Enable evaluation metric tracking for hallucination detection and bias screening
evals = openlit.evals.All(collect_metrics=True)
result = evals.measure(prompt=prompt, contexts=contexts, text=text)
```

TypeScript example:

```typescript theme={null}
import openlit from "openlit"

// Initialize OpenLIT instrumentation
openlit.init()

// Automatic LLM evaluation metrics collection
const evals = new openlit.evals.All({ collectMetrics: true });
const result = await evals.measure({ prompt, contexts, text });
```

Metrics are sent to the same OpenTelemetry backend configured during instrumentation. See our [supported destinations](/latest/sdk/destinations/overview) for configuration details.

You're all set! Your AI applications now have complete model evaluation capabilities with automated hallucination detection, bias screening, and toxicity filtering. Monitor LLM output quality with real-time evaluation metrics.

If you have any questions or need support, reach out to our [community](https://join.slack.com/t/openlit/shared_invite/zt-2etnfttwg-TjP_7BZXfYg84oAukY8QRQ).
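
Once a result comes back, an application will usually want to act on it. The sketch below shows one way to gate a response on the fields that appear in the example output earlier in this guide (`verdict`, `evaluation`, `score`, `classification`, `explanation`). It assumes the object returned by `measure()` exposes those fields as attributes, which the printed output suggests but this guide does not state. `EvalResult` is a hypothetical stand-in used so the snippet runs without an API key, and both `should_block` and the `0.5` threshold are illustrative choices, not part of OpenLIT.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical stand-in mirroring the fields in the example output."""
    verdict: str
    evaluation: str
    score: float
    classification: str
    explanation: str


def should_block(result, threshold: float = 0.5) -> bool:
    """Return True when the text was flagged and the score meets the threshold.

    The threshold is an arbitrary illustrative value; tune it per use case.
    """
    return result.verdict == "yes" and result.score >= threshold


# Values copied from the example output; the explanation is shortened here.
sample = EvalResult(
    verdict="yes",
    evaluation="Hallucination",
    score=0.9,
    classification="factual_inaccuracy",
    explanation="The text gives 1969, while the context says 1921.",
)

if should_block(sample):
    print(f"Blocked ({sample.evaluation}/{sample.classification}): {sample.explanation}")
```

In a real application, `sample` would be replaced by the value returned from `evals.measure(...)`, provided the attribute names match.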
***

- Automatically add evaluation scoring to production traces
- 60+ AI integrations with automatic instrumentation and performance tracking
- Send telemetry to Datadog, Grafana, New Relic, and other observability stacks
- Discover and instrument LLM traffic across Kubernetes, Docker, and Linux using eBPF, with no code changes required