When someone pastes a $42,000 used EV listing into OFFO at 11pm on a Sunday, they want a verdict in under two seconds — not a spinner for 30 seconds while three AI models deliberate. But they also want the analysis to be right.
These two goals appear to be in tension. They’re not, if you sequence the pipeline correctly.
This post describes how OFFO’s deal intelligence pipeline works: a deterministic rule engine that produces a full, meaningful receipt in under 500ms, followed by an asynchronous three-model AI chain that upgrades it in the background. The user never sees a loading state for the AI.
## The two-layer architecture
Every OFFO receipt is generated twice. The first generation is instant and deterministic; the second is asynchronous and AI-powered.

Layer 1: deterministic rule engine

- ~150 deterministic rule patterns
- Negation-aware regex extraction
- Hard blocker detection
- Fit score + evidence score
- Returns in < 500ms

Layer 2: async AI chain

- Grok: damage classification + tone
- Gemini: routine impact + owner translation
- GPT-4o: repair cost breakdown (skipped when not needed)
- Max 3 model calls enforced
- 15-minute background budget
The key insight: the deterministic layer is not a placeholder. It produces a real, actionable verdict based on signals that AI frequently gets wrong anyway — title status, accident history, service records. These facts don’t require inference. They require pattern matching, and pattern matching is fast.
## Receipt pipeline
Here’s the full receipt pipeline from POST to response, including the background AI upgrade path. The client gets Layer 1 synchronously; the Layer 2 upgrade arrives later, surfaced by polling on generation_status.
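From the client's side, the two-layer contract reduces to "render the first response immediately, then poll until the upgrade lands." The sketch below is illustrative, not OFFO's actual client code: the Receipt shape and status values are assumptions, and only the generation_status field name comes from the pipeline description.

```typescript
// Illustrative client-side consumption of the two-layer pipeline.
// The Receipt shape and status values here are assumptions; only the
// generation_status field name is taken from the pipeline description.
interface Receipt {
  generation_status: "deterministic" | "ai_enhanced";
  verdict: "GREEN" | "YELLOW" | "RED";
}

async function pollForUpgrade(
  fetchReceipt: () => Promise<Receipt>,
  intervalMs = 2000,
  maxAttempts = 10
): Promise<Receipt> {
  // Layer 1: the first fetch already contains a full deterministic verdict.
  let receipt = await fetchReceipt();
  let attempts = 0;
  // Layer 2: keep polling until the AI-enhanced version replaces it,
  // or give up and keep the deterministic receipt.
  while (receipt.generation_status !== "ai_enhanced" && attempts < maxAttempts) {
    await new Promise((r) => setTimeout(r, intervalMs));
    receipt = await fetchReceipt();
    attempts++;
  }
  return receipt;
}
```

The important property is that every code path ends with a usable receipt: if the AI upgrade never arrives, the loop simply exits with the deterministic verdict it already has.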
## Signal extraction: negation-aware pattern matching
The most important design decision in the rule engine is negation awareness. A listing that says “no frame damage” is categorically different from one that says “frame damage.” Early versions of GPT-based extractors failed this test regularly.
Each signal pattern carries both a positive match and an optional negation override:
```typescript
// lib/receipt-signal-extractor.ts
const HARD_BLOCKER_PATTERNS: SignalPattern[] = [
  {
    signalId: "frame_damage_major",
    positive: [/\b(frame\s+damage|structural\s+damage|unibody\s+damage)\b/],
    negation: [/\b(no\s+frame\s+damage|no\s+structural\s+damage)\b/],
  },
];

// Negation is checked first — if the negation regex matches,
// the positive match is discarded even if it also fires.
```

This matters for listings that use softening language: “fully repaired flood vehicle” still triggers title_salvage via structured field extraction, even if the listing text tries to minimize it.
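The negation-first rule is simple enough to show end to end. This is a minimal, self-contained sketch of the matching loop; the SignalPattern shape mirrors the snippet above, but extractHardBlockers is an illustrative helper, not the production extractor.

```typescript
// Minimal negation-first matcher. SignalPattern mirrors the shape shown
// above; extractHardBlockers is an illustrative helper, not production code.
interface SignalPattern {
  signalId: string;
  positive: RegExp[];
  negation?: RegExp[];
}

function extractHardBlockers(text: string, patterns: SignalPattern[]): string[] {
  const lower = text.toLowerCase();
  const hits: string[] = [];
  for (const p of patterns) {
    // Negation wins: if any negation regex matches, skip this signal
    // entirely, even if a positive pattern would also fire.
    if (p.negation?.some((re) => re.test(lower))) continue;
    if (p.positive.some((re) => re.test(lower))) hits.push(p.signalId);
  }
  return hits;
}

const patterns: SignalPattern[] = [{
  signalId: "frame_damage_major",
  positive: [/\b(frame\s+damage|structural\s+damage)\b/],
  negation: [/\bno\s+(frame|structural)\s+damage\b/],
}];

console.log(extractHardBlockers("clean title, no frame damage", patterns)); // []
console.log(extractHardBlockers("minor frame damage repaired", patterns));  // ["frame_damage_major"]
```

Because the negation check short-circuits before the positive patterns are consulted, "no frame damage" can never be misread as a damage mention; there is no ordering or scoring subtlety to get wrong.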
A rebuilt-title flood vehicle with partial service history. The rule engine immediately triggers the title_salvage hard blocker and forces a RED verdict, even though the listing text uses softening language like “fully repaired.”
- title_salvage
- ownership_history_clear
- battery_proof_missing
- battery_warranty_unclear
- service_records_missing
- dcfc_unclear
- …and 4 more
{
"fit_score": 100,
"evidence_score": 0,
"verdict": "RED",
"evidence_label": "MISSING",
"scoring_reasons": [
{
"signal_id": "title_salvage",
"category": "listing_risk",
"points": 0,
"label": "Salvage, rebuilt, flood, or lemon title detected"
},
{
"signal_id": "ownership_history_clear",
"category": "listing_risk",
"points": 5,
"label": "Ownership history shown"
},
{
"signal_id": "battery_proof_missing",
"category": "missing_proof",
"points": -15,
"label": "No battery health proof provided"
},
{
"signal_id": "battery_warranty_unclear",
"category": "missing_proof",
"points": -6,
"label": "Battery warranty status not shown"
},
{
"signal_id": "service_records_missing",
"category": "missing_proof",
"points": -8,
"label": "No service history shown"
},
{
"signal_id": "dcfc_unclear",
"category": "missing_proof",
"points": -10,
"label": "DC fast charging support not confirmed"
},
{
"signal_id": "fees_unclear",
"category": "missing_proof",
"points": -8,
"label": "Out-the-door fees or add-ons not clear"
},
{
"signal_id": "tire_condition_unclear",
"category": "missing_proof",
"points": -4,
"label": "Tire condition not visible or mentioned"
},
{
"signal_id": "vin_missing",
"category": "missing_proof",
"points": -6,
"label": "VIN not provided"
},
{
"signal_id": "structural_claim_no_photo",
"category": "listing_risk",
"points": -15,
"label": "Structural damage claimed but no frame/underbody photo"
}
],
"why_not_green": [
{
"signal_id": "battery_proof_missing",
"category": "missing_proof",
"points": -15,
"label": "No battery health proof provided"
},
{
"signal_id": "structural_claim_no_photo",
"category": "listing_risk",
"points": -15,
"label": "Structural damage claimed but no frame/underbody photo"
},
{
"signal_id": "dcfc_unclear",
"category": "missing_proof",
"points": -10,
"label": "DC fast charging support not confirmed"
},
{
"signal_id": "service_records_missing",
"category": "missing_proof",
"points": -8,
"label": "No service history shown"
},
{
"signal_id": "fees_unclear",
"category": "missing_proof",
"points": -8,
"label": "Out-the-door fees or add-ons not clear"
},
{
"signal_id": "battery_warranty_unclear",
"category": "missing_proof",
"points": -6,
"label": "Battery warranty status not shown"
},
{
"signal_id": "vin_missing",
"category": "missing_proof",
"points": -6,
"label": "VIN not provided"
},
{
"signal_id": "tire_condition_unclear",
"category": "missing_proof",
"points": -4,
"label": "Tire condition not visible or mentioned"
},
{
"signal_id": "title_salvage",
"category": "listing_risk",
"points": 0,
"label": "Salvage, rebuilt, flood, or lemon title detected"
}
],
"hard_blocker_hit": true,
"verify_before_visit": [
"No battery health proof provided",
"Structural damage claimed but no frame/underbody photo",
"DC fast charging support not confirmed",
"No service history shown",
"Out-the-door fees or add-ons not clear"
]
}

## The auction AI chain: three models, max three calls
The auction pipeline is more AI-intensive than the receipt pipeline because auction lots don’t have structured fields — we’re working from raw listing text, Copart photos, and VIN lookups. But we still enforce a hard cap: maximum three meaningful model calls per request.
The routing logic is encoded in canSkipRepairCost():
```typescript
// lib/auction/auction-ai-chain.ts
function canSkipRepairCost(
  metrics: DeterministicMetrics,
  arv: number | null,
  isPaid: boolean
): boolean {
  // Only skip when damage is confirmed minor AND we have real damage data
  // (not a data-absent default) — source_confidence: "low" means skip is unsafe
  if (
    metrics.damage_severity_baseline === "minor" &&
    arv !== null &&
    metrics.source_confidence !== "low"
  ) return true;
  return false;
}
```

When the skip fires, GPT-4o is never called. Gemini handles routine impact in parallel with a skipped repair cost slot, and Grok runs the final polish. Total: 2 calls. When the skip doesn’t fire (severe damage, unknown ARV, low source confidence), all three stages run — GPT-4o and Gemini run in parallel at step 2, Grok finalizes at step 3.
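To make the routing concrete, here is a standalone sketch of the skip decision with the types stubbed out. The DeterministicMetrics shape is an assumption inferred from the fields the function reads; it is not the production type.

```typescript
// Standalone sketch of the repair-cost skip decision. The
// DeterministicMetrics shape here is an assumption inferred from the
// fields referenced in canSkipRepairCost, not the production type.
interface DeterministicMetrics {
  damage_severity_baseline: "minor" | "moderate" | "severe" | "unknown";
  source_confidence: "low" | "medium" | "high";
}

function canSkipRepairCost(
  metrics: DeterministicMetrics,
  arv: number | null
): boolean {
  return (
    metrics.damage_severity_baseline === "minor" &&
    arv !== null &&
    metrics.source_confidence !== "low"
  );
}

// Confirmed-minor damage with a known ARV: 2-call path, GPT-4o skipped.
console.log(canSkipRepairCost(
  { damage_severity_baseline: "minor", source_confidence: "high" }, 18500
)); // true

// Low source confidence forces the full 3-call chain even for minor damage.
console.log(canSkipRepairCost(
  { damage_severity_baseline: "minor", source_confidence: "low" }, 18500
)); // false
```

Note that all three conditions must hold to skip; an unknown ARV or a data-absent damage default always pays for the full chain, which keeps the skip conservative.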
Each step is logged as an AiStepLog:
```typescript
export interface AiStepLog {
  step: string;
  model: string;
  status: "success" | "failed" | "skipped" | "cancelled";
  latency_ms: number;
  error?: string;
}
```

## Routine fit: six dimensions, one sigmoid
The routine fit engine is the most mathematically interesting part of OFFO. It converts a user’s daily driving pattern into a 0–100 score across six weighted dimensions. No AI is involved — it’s a pure function.
The range buffer dimension (25% weight) uses a logistic sigmoid instead of piecewise linear buckets. This eliminates the cliff problem where a user at 59% range usage scores dramatically differently from one at 61%:
```typescript
// lib/compute-routine-fit.ts — Phase 4A: sigmoid range buffer
// Centered at 62% usage. Approximate values from the formula below:
//   0% → ~99   30% → ~94   50% → ~73   60% → ~54
//  70% → ~34   80% → ~18   90% → ~8   100% → 5 (floor)
function rangeScoreFromUsagePct(usagePct: number): number {
  const pct = Math.max(0, Math.min(100, usagePct));
  // Logistic sigmoid: 100 / (1 + e^(k*(pct - center))), k = 0.085, center = 62
  const raw = 100 / (1 + Math.exp(0.085 * (pct - 62)));
  return Math.max(5, Math.round(raw));
}
```

Cross-dimension multipliers apply after individual scoring. The catastrophic failure zone collapses the final score when the routine is near-unworkable:
```typescript
// Catastrophic zone: usage > 90% OR (public charging + poor density)
//   × 0.85 collapse multiplier applied to weighted sum
// Compound checks (non-catastrophic):
//   × 0.91 — low charging + cold climate street parking
//   × 0.93 — public charging + high mileage
//   × 0.95 — shared charger + long commute
```

A suburban commuter with garage L2 charging in a mild climate. Usage runs ~12% of range per day. The sigmoid range buffer gives a near-perfect score and the budget dimension clears comfortably.
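To see why the sigmoid removes the cliff, the snippet below reproduces rangeScoreFromUsagePct from compute-routine-fit and evaluates it on either side of the 60% boundary; the weighted-sum value in the multiplier line is invented for the example.

```typescript
// rangeScoreFromUsagePct reproduced from compute-routine-fit above.
function rangeScoreFromUsagePct(usagePct: number): number {
  const pct = Math.max(0, Math.min(100, usagePct));
  const raw = 100 / (1 + Math.exp(0.085 * (pct - 62)));
  return Math.max(5, Math.round(raw));
}

// Neighboring usage levels score within a few points of each other:
// no bucket cliff at the 60% boundary.
console.log(rangeScoreFromUsagePct(59)); // 56
console.log(rangeScoreFromUsagePct(61)); // 52

// Compound multiplier illustration: public charging + high mileage
// applies × 0.93 to the weighted sum (value invented for the example).
const weightedSum = 80;
console.log(Math.round(weightedSum * 0.93)); // 74
```

With piecewise buckets, 59% and 61% could land in different buckets and differ by 15+ points; with the sigmoid, the gap is 4 points and the score degrades continuously all the way to the 5-point floor.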
## Why not AI-first?
The original OFFO prototype was AI-first. A single GPT-4 call received the listing text and returned a structured receipt. It was slow (8–18 seconds), expensive ($0.04–0.12 per call), and wrong in predictable ways.
“Salvage title” in a listing is not ambiguous. It doesn’t require reasoning — it requires reading. A regex finds it in 0.1ms. GPT-4 occasionally missed it when buried in dense listing text.
“No frame damage” would sometimes be parsed as frame damage being mentioned. The rule engine checks the negation pattern first and short-circuits — eliminating this class of error entirely.
A single model timeout left the user with an error page. The deterministic layer means users always get a real receipt — AI upgrades it when available, but the receipt exists immediately regardless.
The deterministic layer handles the 80% of the signal space that doesn’t require inference. The AI chain handles the 20% that does: nuanced damage tone, routine lifestyle translation, repair cost breakdown when ARV is unknown.
## Observability: opt-in pipeline tracing
To instrument the pipeline for debugging without affecting production latency, we added an opt-in trace system. Pass debug_trace: true in the request body and the pipeline captures timings and step logs, then persists them to pipeline_traces via a fire-and-forget write.
```typescript
// lib/debug-trace.ts
export interface PipelineTrace {
  trace_id: string;
  pipeline: "receipt" | "auction" | "routine";
  created_at: string;
  total_latency_ms: number;
  steps: PipelineStep[];
  timings: Record<string, number>;
  meta: Record<string, unknown>;
}
```

```typescript
// In route.ts — zero cost when disabled
const debugEnabled = body.debug_trace === true;
const trace = debugEnabled ? createTrace("receipt") : null;

// Before return:
if (trace) {
  finalizeTrace(trace, { verdict, fit_score, signal_count });
  persistTrace(trace); // fire-and-forget — never blocks response
}
```

Traces are retrievable via GET /api/admin/trace/:traceId (admin key required). This lets us replay a user’s exact pipeline run for debugging without any live-traffic impact.
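The post doesn't show how persistTrace detaches from the response path, so here is one plausible fire-and-forget sketch. Everything in it is hypothetical: writeTraceRow is an invented stand-in for the pipeline_traces write, and the trimmed TraceRow type exists only for this example.

```typescript
// Hypothetical fire-and-forget persistence sketch. writeTraceRow is an
// invented stand-in for the real pipeline_traces write; TraceRow is a
// trimmed-down type for the example only.
interface TraceRow {
  trace_id: string;
  total_latency_ms: number;
}

async function writeTraceRow(_row: TraceRow): Promise<void> {
  // Stand-in for the storage write (e.g. an insert into pipeline_traces).
}

function persistTrace(row: TraceRow): void {
  // Deliberately not awaited: the response returns immediately, and a
  // failed trace write must never surface as a user-facing error.
  void writeTraceRow(row).catch((err) => {
    console.error("trace persist failed", row.trace_id, err);
  });
}
```

The void/.catch pair is the important part of this pattern: the promise is intentionally detached so the write never blocks the response, but rejections are still observed, so a storage outage cannot crash the process with an unhandled rejection.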
## What’s next
The two-layer pattern has proven stable. The areas we’re actively improving:
- ARV resolution pipeline: The four-phase ARV chain (listing comparables → VIN decode → depreciation curve → AI fallback) is the highest-variance component. Improving P1 and P2 hit rates lets us skip GPT-4o more often.
- Rule pattern expansion: The ~150-pattern signal set was built from analyzing real listing text. We continue adding patterns as new seller language shows up in production data.
- Routine fit calibration: The sigmoid center (62%) and multiplier values (0.85, 0.91, 0.93, 0.95) were set analytically. We’re building a feedback loop to calibrate them against real EV owner satisfaction data.
- Trace-driven content: The scenario generator in lib/content/scenario-generator.ts calls real functions and embeds live outputs into blog posts at build time, which means this post updates automatically when the scoring engine changes.
The core principle behind all of this: be deterministic wherever determinism is possible. Reserve inference for the genuinely ambiguous. Move the inference off the critical path. The result is a system that is faster, cheaper, and more predictable than an AI-first approach — and more reliable when models have outages or rate limits.