We raised $28M. Here's what's next for us.
Where we're taking Coval: the loop that keeps your agent improving every week.
We just closed a $28M Series A to make voice AI deployment-ready. This round is fuel for a bigger idea, and we want to tell you where we’re taking Coval next.
Today, Coval is a measurement platform. You simulate thousands of conversations before launch, score every call in production, and sharpen your metrics with human review. That’s the foundation, not the destination.
Building a production agent is a continuous loop: create test cases, find issues, ship fixes, repeat. The volume of that work is simply too high to do by hand.
So we’re turning Coval from a tool you check into a workhorse that runs alongside you — one that automates the arc from “I have an agent” to “my agent is better than it was last week.”
The improvement loop
Every production agent rides the same cycle, with metrics at the core.
The agent story
We’re building Coval agents that work the loop with you, in three places.
Setup: agents that create your evals
- Recommended evals. Coval recommends what evals to run, each citing its source, like a successful_handoff metric drawn from 14 human transfers.
- Metrics from natural language. “Did the agent verify identity before discussing account details?” Coval drafts the rubric and runs it.
- Personas from real conversations. Generate and update personas from monitored conversations.
- Test cases from context. Build test cases from your prompts, documents, and workflows.
Insight: summarize & understand conversations at scale
- Discovery. Find conversations in plain language: “calls over 5 mins where the agent did not escalate on a fraud issue.”
- Occurrence rate. Quantify how often an issue actually happens, on the fly.
- Trends. Catch what’s going wrong that you weren’t tracking, before your customers do.
- Slack daily digest. A per-agent morning summary that reads like a narrative.
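To make Discovery and Occurrence rate a little more concrete, here's a rough sketch of what that plain-language query boils down to once it has been turned into a filter. This is not Coval's API: the `Conversation` fields, the `long_fraud_call_without_escalation` predicate, and the `occurrence_rate` helper are all hypothetical names made up for illustration, and the sample calls are invented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Conversation:
    # Hypothetical record shape, assumed for this sketch only.
    id: str
    duration_s: float
    topic: str
    escalated: bool


def long_fraud_call_without_escalation(c: Conversation) -> bool:
    """'Calls over 5 mins where the agent did not escalate on a fraud issue.'"""
    return c.duration_s > 5 * 60 and c.topic == "fraud" and not c.escalated


def occurrence_rate(
    conversations: List[Conversation],
    predicate: Callable[[Conversation], bool],
) -> float:
    """Fraction of conversations that match the predicate (0.0 if there are none)."""
    if not conversations:
        return 0.0
    hits = sum(1 for c in conversations if predicate(c))
    return hits / len(conversations)


# Invented sample data, just to exercise the functions.
calls = [
    Conversation("c1", 420.0, "fraud", escalated=False),    # matches
    Conversation("c2", 610.0, "fraud", escalated=True),     # escalated
    Conversation("c3", 120.0, "fraud", escalated=False),    # too short
    Conversation("c4", 900.0, "billing", escalated=False),  # wrong topic
]

matching_ids = [c.id for c in calls if long_fraud_call_without_escalation(c)]
rate = occurrence_rate(calls, long_fraud_call_without_escalation)
```

The hard part in practice is the step this sketch skips: getting from a sentence to a reliable predicate, and labeling things like topic and escalation in the first place.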
Optimize: close the loop with self-healing agents
- Coding agent for fixes. Proposes a prompt, tool, or guardrail change, runs it against your eval suite, and opens a PR with the evidence for you to approve.
- Prompt optimization. A lighter path that tunes the prompt alone.
- Easier investigation. Mention @coval in Slack, Linear, or a PR and it pulls the conversations, runs a hypothesis, and posts findings where you already work.
- Review queues. Coval suggests the conversations worth reviewing, and your disagreements with a metric become proposed rubric edits.
The principles behind it
A bad agent suggests irrelevant noise. A magical one suggests exactly what you need, exactly when you need it. That’s hard to build, so we’re holding ourselves to a few rules.
- You stay in control. Every automated decision is inspectable, editable, and reversible. The agents move faster; you remain the arbiter of judgment.
- Evidence over slop. Every recommendation and flagged regression points to its source: which conversations, which signals, what changed, and why.
- Built for humans and agents alike. Every capability ships with an interface for people and an API for the coding agents your team already uses, like Claude Code and Codex.
- Every insight is actionable. We’re not building another evaluation platform. We’re building agent infrastructure that self-heals and helps your agent get better over time.
Ask us anything
We laid out where Coval is headed. Now we want your questions. I’m hosting a live AMA to get into the agentic roadmap, the principles behind it, and what self-healing agent infrastructure really takes.
July 16, 2026 at 11:00 am