Introduction
While traditional software tests have clear pass/fail conditions, AI outputs are non-deterministic: the same input can produce different results. Scorers bridge this gap by providing quantifiable metrics for measuring agent quality. Scorers are automated tests that evaluate agent outputs using model-graded, rule-based, and statistical methods. Each scorer returns a score: a numerical value (typically between 0 and 1) that quantifies how well an output meets your evaluation criteria. These scores let you objectively track performance, compare different approaches, and identify areas for improvement in your AI systems. Scorers can be customized with your own prompts and scoring functions. They can run in the cloud, capturing real-time results, or as part of your CI/CD pipeline, allowing you to test and monitor your agents over time.

Types of Scorers
There are different kinds of scorers, each serving a specific purpose. Here are some common types:

- Textual Scorers: Evaluate accuracy, reliability, and context understanding of agent responses
- Classification Scorers: Measure accuracy in categorizing data based on predefined categories
- Prompt Engineering Scorers: Explore impact of different instructions and input formats
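To make the scoring idea concrete, a minimal rule-based textual scorer might measure what fraction of expected keywords appear in a response and normalize the result to the 0–1 range. This is a hypothetical sketch for illustration, not one of Mastra's built-in scorers:

```typescript
// Hypothetical rule-based scorer: returns the fraction of expected
// keywords found in the output, normalized to the 0-1 range.
function keywordCoverageScore(output: string, keywords: string[]): number {
  if (keywords.length === 0) return 1;
  const text = output.toLowerCase();
  const hits = keywords.filter((k) => text.includes(k.toLowerCase())).length;
  return hits / keywords.length;
}

// Example: 2 of the 4 expected keywords are present.
const score = keywordCoverageScore(
  "Paris is the capital of France.",
  ["Paris", "France", "population", "Seine"],
);
console.log(score); // 0.5
```

Model-graded scorers follow the same contract (output in, score out) but delegate the judgment to an LLM prompt instead of a fixed rule.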
Installation
To access Mastra's scorers feature, install the @mastra/evals package:
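Assuming an npm-based project, the package can be installed with:

```shell
npm install @mastra/evals
```

Use the equivalent command for your package manager (pnpm, yarn, or bun) if your project uses one.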
Live Evaluations
Live evaluations allow you to automatically score AI outputs in real time as your agents and workflows operate. Instead of running evaluations manually or in batches, scorers run asynchronously alongside your AI systems, providing continuous quality monitoring.

Adding Scorers to Agents
You can add built-in scorers to your agents to automatically evaluate their outputs. See the full list of built-in scorers for all available options.

src/mastra/agents/evaluated-agent.ts
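A sketch of an agent configured with a built-in scorer might look like the following. The specific scorer factory (`createAnswerRelevancyScorer`), model choice, and sampling options are assumptions for illustration; check the built-in scorers reference for the exact names available in your version:

```typescript
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";
import { createAnswerRelevancyScorer } from "@mastra/evals/scorers/llm";

export const evaluatedAgent = new Agent({
  name: "evaluated-agent",
  instructions: "You answer questions accurately and concisely.",
  model: openai("gpt-4o-mini"),
  scorers: {
    relevancy: {
      scorer: createAnswerRelevancyScorer({ model: openai("gpt-4o-mini") }),
      // Score every response; lower the rate to sample a fraction of traffic.
      sampling: { type: "ratio", rate: 1 },
    },
  },
});
```

With this in place, each response the agent produces is scored for answer relevancy in the background.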
Adding Scorers to Workflow Steps
You can also add scorers to individual workflow steps to evaluate outputs at specific points in your process:

src/mastra/workflows/content-generation.ts
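A step-level configuration might look like the sketch below. The step id, schemas, scorer factory, and the shape of the `scorers` option are assumptions for illustration; the exact option names may differ across Mastra versions:

```typescript
import { openai } from "@ai-sdk/openai";
import { createStep } from "@mastra/core/workflows";
import { createToxicityScorer } from "@mastra/evals/scorers/llm";
import { z } from "zod";

// Hedged sketch: attach a scorer to a single step so its output is
// evaluated each time the step runs.
export const generateContentStep = createStep({
  id: "generate-content",
  inputSchema: z.object({ topic: z.string() }),
  outputSchema: z.object({ content: z.string() }),
  scorers: {
    toxicity: {
      scorer: createToxicityScorer({ model: openai("gpt-4o-mini") }),
      sampling: { type: "ratio", rate: 1 },
    },
  },
  execute: async ({ inputData }) => {
    // Content generation logic would go here.
    return { content: `Draft about ${inputData.topic}` };
  },
});
```

Scoring only specific steps lets you pinpoint where quality degrades in a multi-step process rather than evaluating only the final output.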
How Live Evaluations Work
When you add scorers to agents or workflow steps, they automatically run in the background as your AI systems execute. The scoring happens asynchronously, so it doesn't block or slow down your agent responses. Results are captured and stored for analysis, allowing you to monitor quality trends over time.

Next Steps
- Learn about built-in scorers available in Mastra
- Create custom scorers for your specific use cases
- Set up CI/CD integration for automated testing

