Latency 2.8s
Claude 3.5 Son.
1.3s
GPT-4o
4.1s
GPT-4o mini
3.1s
0s1.5s3s4.5s6s
Seconds
Escalation 73%
100% 50% 0% Claude 3.5 GPT-4o GPT-4o mini
Identity Verification 91%
Claude 3.5 Son.
88%
GPT-4o
95%
GPT-4o mini
90%
0%25%50%75%100%
Success %
Robotic Tone Detection 61%
Claude 3.5 Son.
45%
GPT-4o
72%
GPT-4o mini
65%
0%25%50%75%100%
Success %
Grouped by Agent Latency Escalation Identity Verification Robotic Tone Detection
Claude 3.5 Sonnet 1.3s 68% 88% 45%
GPT-4o 4.1s 80% 80% 87%
GPT-4o mini 3.1s 75% 90% 65%

Latency

Yes

Human Review

Explanation

Average end-to-end response latency stayed within the 3s target across the call, with no single turn exceeding the threshold.

Escalation

Yes

Human Review

Explanation

The agent correctly recognized when the request exceeded its scope and handed off to a human specialist, passing along full context so the customer did not have to repeat themselves.

Identity Verification

No

Human Review

Explanation

The agent proceeded with account-level requests without completing the required identity verification steps. Verification is mandatory before accessing or modifying any account information.

Robotic Tone Detection

No

Human Review

Explanation

Responses were detected as flat and robotic across several turns, lacking the natural prosody and pacing expected for this persona.

0:00