Engineering

We Built a Hybrid AI System That Actually Helps People Buy Used EVs Without the Regret

OffoLab · April 2026 · 4 min read

How a deterministic-first scoring layer and a hedged multi-LLM chain turned messy CarGurus listings and Copart auctions into actionable results.

When I started looking seriously at used EVs, the listings all looked reasonable on paper: decent price, reasonable miles, clean Carfax. Then reality hit. A buyer would pull the trigger, only to discover six months later that the battery had been heavily fast-charged, that the car had lived in extreme cold, or that the longest driving day in winter consistently used 30–40% more range than advertised. Suddenly a good deal had become a source of stress.

That gap between what the listing promises and what the car actually delivers in someone's real routine is what we set out to close. At the same time, salvage auctions (Copart, IAAI) present an even harder version of the same problem: you have minutes to decide whether a damaged vehicle is worth repairing and whether the math still works after the fix.

We needed a system that could answer both questions quickly, consistently, and in a way a normal buyer (or rebuilder) could trust.

What We Built

We ended up with a hybrid architecture that puts a deterministic rule engine first, then layers on an asynchronous multi-LLM chain only where it adds value.

Phase 1 — Lite Receipt

Under 500 ms

  • Deterministic engine extracts signals from any CarGurus listing
  • Scores risk and confidence against V2 thresholds
  • Hard blockers → RED; low confidence → YELLOW; clean → GREEN
  • Always returns a real result in < 500ms
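
To make the RED / YELLOW / GREEN rule concrete, here is a minimal sketch of how such a deterministic gate could look. The `Signals` type, the `classify` function, and the `MIN_CONFIDENCE` value are illustrative assumptions; the actual V2 thresholds are not published in this post.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff: the real V2 threshold values are not given in the post.
MIN_CONFIDENCE = 0.7


@dataclass
class Signals:
    """Signals extracted from a listing (illustrative shape only)."""
    hard_blockers: list[str] = field(default_factory=list)  # e.g. "flood damage"
    confidence: float = 1.0  # 0..1, lower when listing data is missing


def classify(signals: Signals) -> str:
    """Return RED, YELLOW, or GREEN using deterministic rules only."""
    if signals.hard_blockers:
        return "RED"      # any hard blocker overrides everything else
    if signals.confidence < MIN_CONFIDENCE:
        return "YELLOW"   # not enough data to trust a clean result
    return "GREEN"
```

Because the gate is a pair of plain comparisons with no network calls, it is easy to see why this phase can always return a result quickly.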
Phase 2 — AI Upgrade

Async, 10–30 seconds

  • OpenAI primary at T+0s
  • Gemini at T+8s
  • Grok at T+14s
  • The first valid JSON response wins; the remaining requests are aborted.
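
The staggered chain above can be sketched with `asyncio`. This is not our production code (which runs on Node.js), just an illustration of the hedging pattern under stated assumptions: `hedged_first_valid` and the `Provider` type are hypothetical names, and "valid" here simply means the response parses as a JSON object.

```python
import asyncio
import json
from typing import Awaitable, Callable, Optional

# A provider is any zero-argument async callable that returns raw response text.
Provider = Callable[[], Awaitable[str]]


async def hedged_first_valid(
    providers: list[tuple[float, Provider]],
) -> Optional[dict]:
    """Start each provider after its delay and return the first valid JSON object."""

    async def run(delay: float, call: Provider) -> Optional[dict]:
        await asyncio.sleep(delay)  # stagger the start, e.g. 0 s, 8 s, 14 s
        try:
            parsed = json.loads(await call())
        except Exception:
            return None  # provider error or invalid JSON: let the others continue
        return parsed if isinstance(parsed, dict) else None

    tasks = [asyncio.create_task(run(delay, call)) for delay, call in providers]
    try:
        for finished in asyncio.as_completed(tasks):
            result = await finished
            if result is not None:
                return result  # first valid response wins
        return None  # every provider failed or returned invalid JSON
    finally:
        for task in tasks:
            task.cancel()  # abort anything still sleeping or in flight
```

With delays of 0, 8, and 14 seconds, a fast primary response cancels the fallbacks while they are still waiting, so they never issue a request. If every provider fails, the function returns `None` and the caller still has the Phase 1 result to show.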

A missing battery report, missing service records, or an unconfirmed VIN is enough on its own to push most listings to YELLOW, which is exactly the friction buyers feel later. The same core feeds both used-EV routine fit scoring and Copart arbitrage analysis.

The Scoring That Actually Matters

The heart of the system is RoutineFitScore, which combines six weighted dimensions that map directly to daily ownership friction:

  • Charging Stress
  • Range Buffer
  • Budget Fit
  • Recovery Resilience
  • Climate Friction
  • Utility Fit

  • Sigmoid curve for range buffer (eliminates scoring cliffs)
  • Compound interaction multipliers: winter + street parking + public charging = ×0.91
  • Explicit uncertainty penalties when data is missing
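
Here is a rough sketch of how those three mechanics can fit together. Only the ×0.91 factor comes from our system; the sigmoid midpoint and steepness, the per-field penalty, the 0..1 score scale, and the function names are assumptions chosen for illustration.

```python
import math

# From the post: winter + street parking + public charging together apply x0.91.
COMPOUND_FACTOR = 0.91


def range_buffer_score(buffer_ratio: float, midpoint: float = 1.3,
                       steepness: float = 8.0) -> float:
    """Smooth 0..1 score for usable range divided by the longest-day need.

    midpoint and steepness are hypothetical values. A sigmoid avoids the cliff
    a hard cutoff would create around the threshold.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (buffer_ratio - midpoint)))


def routine_fit(scores: dict[str, float], weights: dict[str, float],
                winter: bool, street_parking: bool, public_charging: bool,
                missing_fields: int = 0,
                penalty_per_missing: float = 0.05) -> float:
    """Weighted average of dimension scores, then multiplier, then penalty."""
    base = sum(weights[k] * scores[k] for k in weights) / sum(weights.values())
    if winter and street_parking and public_charging:
        base *= COMPOUND_FACTOR  # the combination hurts more than each factor alone
    base -= penalty_per_missing * missing_fields  # explicit uncertainty penalty
    return max(0.0, min(1.0, base))
```

The point of the sigmoid is that two cars with nearly identical range buffers get nearly identical scores, instead of landing on opposite sides of a hard threshold.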

The final score translates directly into mental-load labels: Great Fit (low load), Mixed Fit (medium load), and High Friction (high load). For Copart lots we add a probabilistic salvage model that outputs an after-repair value (ARV), a repair cost range, a recommended bid ceiling, and margin scenarios.
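
For the salvage side, the arithmetic behind a bid ceiling and margin scenarios can be illustrated with a simplified, range-based stand-in. This is not the probabilistic model itself: the `SalvageEstimate` fields, the worst-case rule in `bid_ceiling`, and the target margin are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SalvageEstimate:
    """Illustrative inputs for one lot (all amounts in the same currency)."""
    arv: float          # after-repair value
    repair_low: float   # optimistic repair cost
    repair_high: float  # pessimistic repair cost
    fees: float = 0.0   # auction, transport, and title fees


def bid_ceiling(est: SalvageEstimate, target_margin: float) -> float:
    """Highest bid that still leaves target_margin (a fraction of ARV)
    as profit even if repairs land at the high end of the range."""
    return max(0.0, est.arv * (1.0 - target_margin) - est.repair_high - est.fees)


def margin_scenarios(est: SalvageEstimate, bid: float) -> dict[str, float]:
    """Profit at a given bid for the best, midpoint, and worst repair cost."""
    mid = (est.repair_low + est.repair_high) / 2
    costs = (("best", est.repair_low), ("mid", mid), ("worst", est.repair_high))
    return {name: est.arv - bid - est.fees - cost for name, cost in costs}
```

As a worked case with made-up numbers: an ARV of 20,000, repairs between 4,000 and 7,000, fees of 1,000, and a 15% target margin give a ceiling of 9,000. At that bid the worst case still clears 3,000 and the best case clears 6,000.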

What the Numbers Show So Far

Since March 2026 the system has processed 118 receipt checks and 167 Copart auction analyses.

  • Lite receipt latency: < 500 ms
  • Full AI upgrade success rate: 91% (after Node.js fix in April)
  • Save rate on full results: 36.4%

These aren't vanity metrics. They show the deterministic-first design keeps the system reliable and fast while the LLM layer only augments where it's safe. The 36.4% save rate is an early signal that the outputs are useful enough for buyers to return to.

What This Means for Buyers Right Now

The used-EV market is finally offering real choice at lower prices. The difference between a great ownership experience and quiet regret often comes down to the exact same signals our system surfaces early: battery history gaps, winter buffer realism, and charging access that actually matches your routine.

We built this because we got tired of watching thoughtful buyers make expensive decisions with incomplete information. The goal was never to replace human judgment. It was to give people better information so that their judgment could be sharper.

The system is live at offolab.com. If you're shopping a used EV or evaluating a salvage lot right now, I'd genuinely value your feedback on how the analysis matches (or doesn't match) what you're seeing in the real world.

What's the one detail that still feels hardest to evaluate when you're looking at a used EV listing?

We read every comment.