Waymo's new model benchmarks robotaxi performance against human drivers

In a January incident, Waymo's previous simulation model claimed an attentive human driver would have impacted a child at 14 mph, while its robotaxi hit the child at 6 mph.

Sofia Reyes

June 10, 2026 · 3 min read

Waymo robotaxi on a city street with a holographic display comparing its performance to a human driver.

When a Waymo robotaxi struck a child at 6 mph in January, the company's previous simulation model estimated that an attentive human driver in the same situation would have hit the child at 14 mph. The gap between those two figures shows how much rests on the human baseline when an autonomous system confronts a sudden, real-world hazard, and why robotaxi performance needs sharper benchmarks.

Waymo is now releasing an open-source model for transparency and benchmarking. However, the inherent complexity of simulating human behavior, combined with the company's past internal claims, raises questions about how much the model will ultimately do for public trust.

Waymo's push for a standardized, open-source human behavior model appears likely to give the autonomous vehicle industry a new tool for demonstrating safety, potentially accelerating public acceptance and regulatory pathways, provided the model withstands independent validation.

Introducing the Reference Driver: A New Benchmark for Robotaxi Safety

Waymo has developed the Reference Driver (ReD), a new computer model designed to benchmark robotaxi performance against human drivers, according to TechCrunch. This framework, introduced with TU Delft, specifically simulates human decision-making in critical pre-collision scenarios, focusing on split-second crash avoidance, according to Mezha.

Crucially, Waymo is making the ReD model open source and publicly available, according to The Verge. That decision shifts the industry from internal, potentially biased safety claims toward a transparent, externally verifiable approach. The implication is that public trust demands more than self-reported statistics: it requires a shared, auditable standard for safety validation, one that could set a new precedent for regulatory acceptance.

Benchmarking Against Human Performance: Incidents and Overall Safety Records

In the January incident, Waymo's previous model suggested a human driver would have hit the child at 14 mph, compared with the robotaxi's 6 mph impact, according to TechCrunch. The collision was real, but the human comparison was simulated, and that gap illustrates how difficult it is to prove safety definitively in complex, real-world scenarios, where public perception often outweighs statistical averages.

Broader data offers a different perspective. Waymo robotaxis average 2.1 police-reported crashes per million miles, well below the 4.68 per million miles for human drivers, according to Crypto Briefing. Waymo's own data points the same way: it puts the human average at 4.85 police-reported crashes per million miles, which implies a 57% reduction for the Waymo Driver, according to Waymo. Despite the slight discrepancy between the external and internal human baselines, both sets of figures position robotaxis as statistically safer by traditional crash metrics. However, the true test lies not just in reducing overall incidents but in demonstrating superior performance in the most challenging, unexpected situations that erode public trust.
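
The percentages above follow from simple arithmetic on the quoted rates. As a minimal sketch, assuming the rates are directly comparable (same crash definition and mileage basis, which the sources cited here do not confirm), the relative reduction can be checked in Python. The function name percent_reduction and the variable names are illustrative only and are not part of any Waymo tooling.

```python
def percent_reduction(baseline: float, observed: float) -> float:
    """Relative reduction of `observed` versus `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - observed) / baseline * 100.0


# Police-reported crashes per million miles, as quoted in the article.
WAYMO_RATE = 2.1
HUMAN_RATE_EXTERNAL = 4.68  # human baseline attributed to Crypto Briefing
HUMAN_RATE_WAYMO = 4.85     # human baseline attributed to Waymo

# Assumption: both baselines measure the same thing as WAYMO_RATE.
reductions = {
    "external baseline": percent_reduction(HUMAN_RATE_EXTERNAL, WAYMO_RATE),
    "Waymo baseline": percent_reduction(HUMAN_RATE_WAYMO, WAYMO_RATE),
}

for label, value in reductions.items():
    print(f"{label}: {value:.1f}% fewer crashes per million miles")
```

Against Waymo's own human baseline the reduction rounds to the 57% quoted; against the 4.68 figure it comes out closer to 55%, which is the size of the discrepancy between the two sources.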

The operational environment also matters. San Francisco, for instance, recorded 5.55 injury-reported incidents per million miles, roughly three times higher than the national average, according to Waymo. That elevated baseline in a dense urban setting means robotaxis must not only outperform human drivers but also do so in a higher-risk landscape. The industry's ongoing challenge is to translate this statistical superiority into public confidence, especially when specific, high-profile pre-collision events dominate headlines.

If the Reference Driver model withstands rigorous independent validation, it appears likely to standardize safety benchmarks across the autonomous vehicle industry, potentially accelerating regulatory approval and public acceptance for robotaxis.