Avaamo × Cloudflare
A primitive-by-primitive mapping · June 2026
The Load-Bearing Edge

Avaamo cracked the last mile of enterprise AI.
Cloudflare is the load-bearing edge underneath it.

Agent Studio, Orchestrator, Trust AI, Knowledge AI, Voice Gateway, A2A Communication: every layer of the Avaamo agentic platform maps to one or more Cloudflare developer primitives. This page walks through each one, then ends with what changes Monday.

"There's a dirty secret in enterprise AI that nobody wants to talk about: most deployments fail not because the technology doesn't work, but because nobody can actually get it to work where it matters." Ram Menon, CEO of Avaamo — "Why enterprise AI dies in the last mile"

Avaamo's pipeline, primitive-by-primitive on Cloudflare

The platform overview at avaamo.ai/agentic-platform shows your own architecture diagram: Agent Studio → Orchestrator → Trust AI → Prompt Library → Knowledge AI → Integrations. Here is the same pipeline, with the Cloudflare primitive that satisfies each layer's runtime requirements.

| Avaamo layer | What it requires at runtime | Cloudflare primitive |
|---|---|---|
| Agent Studio: Low/no-code build surface | Branch-per-tenant preview environments, instant rollback, version pinning per customer | Pages, Workers: Preview URLs per branch; atomic deploys; instant rollback to any prior build hash. |
| Orchestrator: Patented multi-agent coordination | Durable state per conversation, strict ordering, single-writer semantics across regions | Durable Objects, Queues: One DO per active orchestration; Queues for fan-out to sub-agents with at-least-once delivery. |
| Trust AI: Native Trust Layer for guardrails | Inline policy enforcement before model call: PII redaction, prompt-injection defense, per-tenant cost ceiling | AI Gateway, Workers: Logged, cached, rate-limited, fallback-routed LLM calls, with spend limits and identity-bound budgets (GA 2026-06-05). |
| Prompt Library: English-command pre-built prompts | Versioned prompt storage close to inference, low-read-latency, immutable for audit | Workers KV, R2: KV for hot prompt reads at every POP; R2 for the versioned artifact store with zero egress. |
| Knowledge AI / LLaMB™: Enterprise retrieval at scale | Vector search with per-tenant isolation, real-time index sync from structured + unstructured sources | Vectorize, R2, D1: Tenant-scoped vector indexes co-located with the inference Worker; R2 for source corpora. |
| Integrations: Modern SaaS + legacy systems | Outbound connectivity into customer estates without opening inbound firewall holes; mTLS to partners | Cloudflare Tunnel, Access, Workers: Zero-Trust tunnels into on-prem legacy; Access policies for partner identity; mTLS at the edge. |
| Voice AI / Voice Gateway: Voice-native agentic platform | Sub-150ms turn latency globally, regional data residency, carrier-grade DDoS resilience | Workers, Realtime, Regional Services: Anycast turn handling at 330+ POPs; SFU for audio mixing; region-pinned data paths for compliance. |
| A2A Communication: Agent-to-agent transport | Reliable message passing between agents in different tenants, regions, or sub-orgs | Queues, Durable Objects, Workers: Typed message contracts; ordered fan-out; cross-region replication via DO migration. |
| Live Agent for MyChart: Human handoff inside Epic | PHI data path control, BAA-covered logging, audit trail, sub-second handoff latency | Regional Services, Logpush, Tunnel: US-only data path; Logpush to customer-owned R2/S3 for HIPAA audit; private tunnel into hospital network. |
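
The Orchestrator and A2A rows both lean on at-least-once delivery, which means a sub-agent may receive the same message more than once and has to tolerate it. A minimal TypeScript sketch of an idempotent consumer is below. The `AgentMessage` shape and the `IdempotentConsumer` class are hypothetical names chosen for illustration, not an Avaamo or Cloudflare contract, and the in-memory `Set` stands in for whatever durable storage a real deployment would use.

```typescript
// Hypothetical message contract; field names are illustrative only.
interface AgentMessage {
  id: string;             // stable across redeliveries of the same logical message
  conversationId: string;
  payload: string;
}

// Deduplicates by message id so that a redelivered message
// is applied at most once by this consumer.
class IdempotentConsumer {
  private seen = new Set<string>();
  applied: AgentMessage[] = [];

  // Returns true when the message is applied, false when it is a duplicate.
  handle(msg: AgentMessage): boolean {
    if (this.seen.has(msg.id)) return false;
    this.seen.add(msg.id);
    this.applied.push(msg);
    return true;
  }
}
```

In a one-orchestration-per-object design, the `seen` set would most naturally live alongside the conversation state, so the single writer that orders messages also decides what counts as a duplicate.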

UCHealth, MyChart, Gigi: the wedge nobody else can match

Voice is the layer where every other infra choice is forgiven or punished. The page at avaamo.ai/care-companion says "Your AI-powered guide to smarter, simpler healthcare right from your website or MyChart." That sentence carries four load-bearing requirements that Cloudflare uniquely meets at the edge.

Turn-by-turn latency, globally

A voice agent must respond inside the natural conversational gap: roughly 200 ms before the pause becomes perceptible, of which only 80-120 ms can be spent on the network. From a single us-west-2 origin, every patient outside the western US starts the conversation behind.

Cloudflare Workers + Realtime place the first hop inside the patient's metro. The Avaamo Orchestrator only sees clean audio frames; the long-haul disappears.

HIPAA data path, not just paper

SOC 2 Type II, ISO 27001, NIST 800-171, HIPAA — the certifications are there. The data path still has to honor them on every call. PHI in a voice frame routing through a non-BAA region is a control failure regardless of the audit binder.

Regional Services guarantees TLS termination and processing in customer-selected jurisdictions. Logpush writes audit telemetry directly into customer-owned R2 or S3 — chain-of-custody intact.

Epic / MyChart integration without firewall holes

Hospital networks don't open inbound. Cloudflare Tunnel reverses the connection model — the customer's on-prem connector dials out to Cloudflare; Avaamo agents reach Epic FHIR endpoints through the tunnel; no public surface.

Access policies bind every tunneled call to an identity (provider, agent, system) with full audit on the same surface as the rest of the deployment.

DDoS as a Tier-1 patient-safety control

When Gigi answers patient questions from a hospital homepage, an inbound L7 flood becomes a clinical event, not an infra event. The current footprint (Apache + EC2 + a single CloudFront distribution) absorbs DDoS at origin compute cost.

Cloudflare absorbs Tbps-class L3/L4 and L7 attacks before they consume an EC2 cycle, on the same anycast network that serves the legitimate voice turn.

The Trust Layer is a marketing surface until it lives at the gateway

Avaamo's positioning around Trust AI — guardrails, hallucination prevention, security and compliance without compromising performance — is the single most important promise the platform makes. AI Gateway is the runtime where that promise becomes enforceable rather than aspirational.

GA · 2026-06-05

AI Gateway: spend limits and identity-bound budgets, shipped today

Cloudflare announced general availability of spend limits and identity-driven budgets in AI Gateway today (June 5, 2026). Every LLM call from every Avaamo agent — across LLaMB™, Anthropic, OpenAI, Google, or self-hosted — can now route through a single inspection plane that enforces per-tenant, per-user, per-agent cost ceilings before the upstream call leaves Cloudflare's network.

Combined with caching, fallback routing, prompt logging, PII redaction, and per-route rate limits, this is the missing runtime for Trust AI. The same gateway gives Avaamo's customers a single audit log — one place to answer the regulator's question about which model saw which data when.
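
To make "enforce a cost ceiling before the upstream call" concrete, here is a minimal TypeScript sketch of the check itself. It is a conceptual model only: `SpendLedger`, `authorize`, and the cent-denominated estimates are hypothetical and are not AI Gateway's API or implementation. The point it illustrates is that the decision is made before any request is sent to a model provider.

```typescript
// Result of a pre-call budget check.
type Decision =
  | { allowed: true; remainingCents: number }
  | { allowed: false; reason: string };

// Tracks spend per tenant in integer cents to avoid floating-point drift.
class SpendLedger {
  private ceilings: Map<string, number>;
  private spent = new Map<string, number>();

  constructor(ceilings: Map<string, number>) {
    this.ceilings = ceilings;
  }

  // Called before the upstream model request is sent.
  authorize(tenantId: string, estimatedCents: number): Decision {
    const ceiling = this.ceilings.get(tenantId);
    if (ceiling === undefined) {
      return { allowed: false, reason: "unknown tenant" };
    }
    const used = this.spent.get(tenantId) ?? 0;
    if (used + estimatedCents > ceiling) {
      return { allowed: false, reason: "ceiling exceeded" };
    }
    // Reserve the estimated spend so later calls see it.
    this.spent.set(tenantId, used + estimatedCents);
    return { allowed: true, remainingCents: ceiling - used - estimatedCents };
  }
}
```

The same shape extends to per-user or per-agent ceilings by changing what the key identifies; a denied decision is also the natural place to write the audit record.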

What Cloudflare adds in front of avaamo.ai without changing a line of code

Before any of the platform mapping above, the simplest first step is the public ingress. Orange-clouding avaamo.ai, app.avaamo.ai, and the API surface immediately adds the following without origin changes.

WAF managed rules + custom logic: OWASP Top 10, prompt-injection signatures, per-route policies. Block-at-edge for known-bad before traffic touches the WordPress + Apache front end.
Bot Management: Distinguishes legitimate browser sessions, partner integrations, and known-good crawlers from credential-stuffing, scraping, and form-abuse bots hitting /experience-a-demo/ and contact endpoints.
DDoS L3/L4/L7: Tbps-scale absorption at anycast. No origin scaling, no scrubbing center fail-over, no cost-of-attack negotiations.
Page Shield: Continuous monitoring of third-party scripts (analytics, retargeting, embedded widgets). Alerts when a dependency starts exfiltrating or changes integrity hash.
Rate limiting on identity-creating endpoints: Form submissions, demo requests, and trial activations get per-IP and per-fingerprint limits without application changes.
mTLS at the edge for partner integrations: Each large enterprise integration (Wipro, VW, Penske, Siemens) can enforce mTLS termination at Cloudflare with no certificate handling inside the application tier.
Argo Smart Routing: Cross-region traffic routed over Cloudflare's backbone instead of the public internet. Tail-latency improvements measurable per POP, per customer.
Single observability surface: Logpush from Cloudflare, AI Gateway, Workers, and R2 into one customer-owned destination. One audit story for SOC 2, HIPAA, GDPR, NIST 800-171 across the entire data path.

What changes Monday, week by week

No rip-and-replace. Every phase is reversible by toggling a DNS record or disabling a route. The goal is to land each value claim with a benchmark Avaamo's own engineering team owns.

01
Week 1

Orange-cloud the front door

Cloudflare in front of avaamo.ai and the marketing surface. WAF, DDoS, Bot Management, Page Shield active with zero origin change. Logpush wired.

02
Weeks 2–4

AI Gateway in front of LLaMB™

Route one Avaamo agent's LLM calls through AI Gateway. Caching, fallback, PII redaction, spend caps. Benchmark cost-per-conversation against current path.

03
Weeks 5–8

Voice turn handling on Workers

One Voice Gateway tenant moves first-hop turn handling to Workers + Realtime. Measure p50/p95 latency improvement by geography.
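
For the p50/p95 comparison, both paths need to be summarized the same way. A minimal TypeScript sketch using the nearest-rank percentile definition is below; the `percentile` helper is a hypothetical name, and nearest-rank is one common convention among several, assumed here rather than prescribed by the pilot.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank arithmetic in integers where possible.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}
```

Running it per geography on turn latencies collected from the current path and from the Workers path gives directly comparable p50 and p95 figures, provided both sets are measured at the same point in the call.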

04
Weeks 9–12

Vectorize pilot for Knowledge AI

Stand up one Knowledge AI tenant on Vectorize with R2 corpus. Compare retrieval quality and per-query cost against current vector store.
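
"Retrieval quality" needs a concrete metric before the two stores can be compared. One common choice is recall@k against a hand-labelled set of relevant documents per query. The TypeScript sketch below assumes that metric; the `recallAtK` helper and the use of string document ids are hypothetical, and the pilot may well prefer a different measure.

```typescript
// Recall@k: the fraction of known-relevant documents that appear
// in the first k results returned for a query.
function recallAtK(retrieved: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) throw new Error("relevant set must not be empty");
  const topK = retrieved.slice(0, k);
  // Count each relevant id once, even if the store returns it twice.
  const hits = topK.filter(
    (id, index) => topK.indexOf(id) === index && relevant.has(id)
  ).length;
  return hits / relevant.size;
}
```

Averaging the score over the same query set for the current vector store and for the Vectorize pilot gives a like-for-like quality number to set beside per-query cost.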

The fastest-moving AI companies chose the load-bearing edge first

The reason isn't brand. The reason is that selling Trust requires the infrastructure underneath the Trust Layer to be load-bearing in its own right. Anthropic, ElevenLabs, and Perplexity all made the same call before they had to.

Anthropic

Workers + AI Gateway in the request path of model-serving workloads. Public CDN partner for Claude documentation and developer surfaces.

ElevenLabs

Voice generation at conversational latencies. The same anycast network Cloudflare runs is the one their voice-native product depends on.

Perplexity

Workers-resident routing and AI Gateway for upstream model fan-out. Single observability surface across the entire inference path.

The pattern is consistent: companies whose product is the AI experience — not a SaaS app that happens to call a model — choose Cloudflare for the layer underneath. Avaamo's positioning around the last mile, the Trust Layer, and voice-native agents puts the platform squarely in that category.