Ship AI with Laravel: from demo to production — failover, queues, middleware
The gap between a working AI demo and a system that survives a 2am provider outage is three primitives — and Laravel now ships all of them.
Harris Raftopoulos walked through the production layer of the Laravel AI SDK. Provider failover: pass an array of providers, and if OpenAI fails the call retries on Anthropic, then Gemini; the provider list is then moved to config. Queued execution: the call runs in the background and the user gets instant confirmation. Three middleware: logging, per-user rate limiting (10/min), and cost tracking per user, agent, and day. Rate limiting is ordered before logging so rejected calls don’t burn work.
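To make the ordering argument concrete, here is a minimal, language-neutral sketch in Python rather than PHP. None of it is the Laravel AI SDK's API: `with_failover`, `rate_limit`, `log_calls`, and the two stub providers are hypothetical names invented for illustration. It only shows the shape of the idea: an ordered provider list, and a rate limiter wrapped outside the logger.

```python
import time
from typing import Callable, Dict, List, Optional

Handler = Callable[[str, str], str]  # (user_id, prompt) -> reply


class RateLimited(Exception):
    """Raised when a user exceeds the per-window call budget."""


def with_failover(providers: List[Handler]) -> Handler:
    """Try each provider in order and return the first successful reply."""
    def call(user_id: str, prompt: str) -> str:
        last_error: Optional[Exception] = None
        for provider in providers:
            try:
                return provider(user_id, prompt)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
        raise RuntimeError("all providers failed") from last_error
    return call


def rate_limit(next_handler: Handler, limit: int = 10, window: float = 60.0,
               clock: Callable[[], float] = time.monotonic) -> Handler:
    """Reject a call once the user has made `limit` calls in the last `window` seconds."""
    history: Dict[str, List[float]] = {}

    def call(user_id: str, prompt: str) -> str:
        now = clock()
        recent = [t for t in history.get(user_id, []) if now - t < window]
        history[user_id] = recent
        if len(recent) >= limit:
            raise RateLimited(user_id)
        recent.append(now)
        return next_handler(user_id, prompt)
    return call


def log_calls(next_handler: Handler, log: List[str]) -> Handler:
    """Record one log line per call that gets this far down the stack."""
    def call(user_id: str, prompt: str) -> str:
        log.append(f"user={user_id} prompt_chars={len(prompt)}")
        return next_handler(user_id, prompt)
    return call


# Hypothetical stand-ins for real provider clients.
def failing_primary(user_id: str, prompt: str) -> str:
    raise TimeoutError("primary provider unavailable")


def working_fallback(user_id: str, prompt: str) -> str:
    return f"fallback reply to: {prompt}"


log: List[str] = []
# Rate limiting is the outermost layer, so a rejected call never reaches logging.
agent = rate_limit(log_calls(with_failover([failing_primary, working_fallback]), log))
```

With this wiring, a user's eleventh call inside a minute raises `RateLimited` before any log line is written or any provider is contacted, which is the whole point of putting the limiter first.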
This is the part most "build an AI agent" tutorials skip, and it is the part that decides whether your feature is a toy or a product. Failover, backpressure, and observability used to be bespoke glue that every team rewrote badly. Now that they are first-class, config-driven primitives, a Laravel dev’s existing mental model (queues, events, middleware) is the AI production stack.
I run this for real across Kai, Sol and HelpMatch, and the article is honest but optimistic. What it doesn’t tell you: the three providers don’t fail politely one at a time. A bad prompt or a region outage can 500 OpenAI and Anthropic together, and your "automatic retry" just triples latency before failing anyway. Set aggressive per-provider timeouts. And a queue doesn’t give you idempotency for free: if the job retries after a partial LLM call, you double-charge yourself. Make the call idempotent on a request key before you trust the queue.
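Both warnings can be sketched in a few lines. This is again Python and again hypothetical: `call_with_budget`, `IdempotentCall`, and the `(prompt, timeout)` provider signature are assumptions made for illustration, not anything the SDK exposes. The first function caps each provider attempt and the total wait; the class reuses a stored reply when a job is retried with the same request key.

```python
import time
from typing import Callable, Dict, List, Optional

# Hypothetical provider signature: the client itself must honour the timeout.
Provider = Callable[[str, float], str]  # (prompt, timeout_seconds) -> reply


def call_with_budget(providers: List[Provider], prompt: str,
                     per_provider_timeout: float = 5.0,
                     total_budget: float = 8.0,
                     clock: Callable[[], float] = time.monotonic) -> str:
    """Fail over in order, but never wait longer than `total_budget` overall."""
    deadline = clock() + total_budget
    last_error: Optional[Exception] = None
    for provider in providers:
        remaining = deadline - clock()
        if remaining <= 0:
            break  # budget spent: stop instead of stacking up more latency
        try:
            return provider(prompt, min(per_provider_timeout, remaining))
        except Exception as exc:
            last_error = exc
    raise RuntimeError("no provider answered within budget") from last_error


class IdempotentCall:
    """Run the LLM call at most once per request key and reuse the stored reply."""

    def __init__(self, call_llm: Callable[[str], str],
                 store: Optional[Dict[str, str]] = None) -> None:
        self._call_llm = call_llm
        # Assumption: in production this would be a shared cache or database
        # table, not an in-process dict.
        self._store: Dict[str, str] = {} if store is None else store
        self.provider_calls = 0

    def run(self, request_key: str, prompt: str) -> str:
        if request_key in self._store:
            return self._store[request_key]  # retry: no second charge
        self.provider_calls += 1
        reply = self._call_llm(prompt)
        self._store[request_key] = reply
        return reply
```

One caveat on the sketch: it protects a retry that happens after the reply was stored. A crash between the provider answering and the reply being saved still costs a second call, so a real implementation would need an atomic write or an in-flight marker in the shared store.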
EC TV is written by Eduardo Cruz — a senior Laravel engineer who ships production AI agents and MCP servers.