How We Scaled a Laravel API to Handle Millions of Requests

Backend 2 min read

Mohab Bahlool

June 04, 2026

Scaling a Laravel API from a few thousand requests to millions per day is a challenge most growing applications eventually face. In this article, we walk through the strategies our team used to achieve 10x throughput without degrading response times.

[Figure: Laravel API scaling architecture diagram]

Identifying the Bottlenecks

Before optimizing, we profiled every layer of the stack. The main culprits were N+1 queries, unindexed database columns, and a Redis cache with a hit rate of only 30%. Using Laravel Debugbar and Telescope, we pinpointed the slowest endpoints.

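// Before: N+1 queries (one query for the orders, then one more per order for its user)
$orders = Order::where("status", "active")->get();
foreach ($orders as $order) {
    echo $order->user->name; // lazy-loads the user on every iteration
}

// After: eager loading (two queries in total, regardless of the number of orders)
$orders = Order::with("user")
    ->where("status", "active")
    ->get();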
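
Once the obvious N+1 loops are fixed, it helps to stop new ones from slipping in. The following is a minimal sketch of one way to do that in a service provider, not a description of our exact setup: it makes lazy loading throw outside production and logs slow queries. The 100 ms threshold is an illustrative assumption.

```php
<?php

namespace App\Providers;

use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Events\QueryExecuted;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        // Throw an exception when a relation is lazy-loaded outside production,
        // so N+1 patterns fail loudly in development and in tests.
        Model::preventLazyLoading(! $this->app->isProduction());

        // Log every query slower than the threshold.
        // Assumption: 100 ms is an arbitrary example value.
        DB::listen(function (QueryExecuted $query): void {
            if ($query->time > 100) { // $query->time is in milliseconds
                Log::warning('Slow query', [
                    'sql' => $query->sql,
                    'time_ms' => $query->time,
                ]);
            }
        });
    }
}
```

With this in place, the "Before" loop above would raise an exception locally instead of silently issuing one query per order.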

Horizontal Scaling with Octane

Laravel Octane keeps the framework booted in memory between requests, so each request skips the bootstrap cost. We deployed it on Swoole and saw roughly a 5x reduction in response times under load.
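
Keeping the application in memory also means that container singletons outlive a single request. As a hedged sketch of how per-request state might be kept from leaking between requests, the provider below uses the container's `scoped()` binding, which is flushed when an Octane worker starts a new request. `RequestContext` and `RequestContextServiceProvider` are hypothetical names introduced for illustration only.

```php
<?php

namespace App\Providers;

use Illuminate\Support\ServiceProvider;

// Hypothetical per-request value holder; not part of Laravel or of our codebase.
// In a real project each class would live in its own file.
final class RequestContext
{
    /** @var array<string, mixed> */
    private array $values = [];

    public function set(string $key, mixed $value): void
    {
        $this->values[$key] = $value;
    }

    public function get(string $key, mixed $default = null): mixed
    {
        return $this->values[$key] ?? $default;
    }
}

class RequestContextServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        // scoped(): resolved at most once per request lifecycle, then discarded.
        // A singleton() binding here would be shared by every request that the
        // same long-lived worker handles.
        $this->app->scoped(RequestContext::class, fn () => new RequestContext());
    }
}
```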

"Octane alone reduced our P95 latency from 420ms to 85ms. Combined with RoadRunner, we handled 12,000 concurrent connections on a single server."

Cache Everything, But Cache Smart

We implemented a multi-tier caching strategy: in-memory for hot data, Redis for sessions and rate limiting, and database-level query result caching for expensive reporting queries.

Hot Cache: Frequently accessed user profiles and settings — TTL 60 minutes
Warm Cache: Aggregated statistics and counts — TTL 5 minutes
Cold Cache: Historical reports and logs — stored in dedicated cache tables
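
The hot and warm tiers above map naturally onto `Cache::remember()` with different TTLs. The sketch below is an illustration under stated assumptions: the class name, cache keys, and the `User` and `Order` models are hypothetical, and it uses whichever cache store is configured as the default.

```php
<?php

namespace App\Support;

use App\Models\Order;
use App\Models\User;
use Illuminate\Support\Facades\Cache;

// Hypothetical helper class; names and keys are illustrative.
class CachedLookups
{
    // Hot tier: user profile, TTL 60 minutes.
    public function userProfile(int $userId): User
    {
        return Cache::remember(
            "users:{$userId}:profile",
            now()->addMinutes(60),
            fn () => User::findOrFail($userId),
        );
    }

    // Drop the cached profile after an update so readers do not see stale data.
    public function forgetUserProfile(int $userId): void
    {
        Cache::forget("users:{$userId}:profile");
    }

    // Warm tier: aggregated count, TTL 5 minutes.
    public function activeOrderCount(): int
    {
        // Cast because some stores hand numeric values back as strings.
        return (int) Cache::remember(
            'stats:orders:active',
            now()->addMinutes(5),
            fn () => Order::where('status', 'active')->count(),
        );
    }
}
```

The closure only runs on a cache miss, so the database is hit at most once per key per TTL window under normal conditions.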

Queue Everything Async

Email notifications, PDF generation, and webhook delivery, along with anything else that did not have to finish inside the request, moved to Redis-backed Laravel queues monitored with Horizon. This freed up request workers to handle real-time API traffic.

// Dispatching a job
ProcessPodcast::dispatch($podcast)
    ->onQueue("processing")
    ->delay(now()->addSeconds(10));
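
For context, a queued job class for the dispatch call above might look roughly like the following. This is a sketch, not our production job: the `Podcast` model, the retry count, the timeout, and the backoff schedule are all assumed values, and the body of `handle()` is left as a placeholder.

```php
<?php

namespace App\Jobs;

use App\Models\Podcast; // assumed model, for illustration
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ProcessPodcast implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    // Give up after three attempts (assumed value).
    public int $tries = 3;

    // Let the worker kill the job if it runs longer than 120 seconds (assumed value).
    public int $timeout = 120;

    public function __construct(public Podcast $podcast)
    {
    }

    // Seconds to wait before the first, second, and third retry (assumed values).
    public function backoff(): array
    {
        return [10, 60, 300];
    }

    public function handle(): void
    {
        // Placeholder: the slow, non-critical work runs here, off the request path.
    }
}
```

Because the job uses `SerializesModels`, only the model identifier is stored in Redis and the model is reloaded from the database when the worker picks the job up.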

Database Optimization

We added composite indexes, routed reporting queries to read replicas, and pushed complex aggregations into SQL with DB::raw() instead of looping over rows in PHP.
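
To make the indexing and aggregation points concrete, here is a hedged sketch. The table and column names (`orders`, `status`, `created_at`, `total`), the index name, and the function name are assumptions for illustration, not our actual schema.

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Collection;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;

// Migration: composite index for queries that filter by status and then by date.
class AddStatusCreatedAtIndexToOrders extends Migration
{
    public function up(): void
    {
        Schema::table('orders', function (Blueprint $table): void {
            $table->index(['status', 'created_at'], 'orders_status_created_at_index');
        });
    }

    public function down(): void
    {
        Schema::table('orders', function (Blueprint $table): void {
            $table->dropIndex('orders_status_created_at_index');
        });
    }
}

// Reporting query: the database groups and sums, so PHP receives one row per day
// instead of iterating over every order.
function dailyActiveOrderTotals(string $since): Collection
{
    return DB::table('orders')
        ->select(
            DB::raw('DATE(created_at) AS day'),
            DB::raw('COUNT(*) AS order_count'),
            DB::raw('SUM(total) AS revenue'),
        )
        ->where('status', 'active')
        ->where('created_at', '>=', $since)
        ->groupBy(DB::raw('DATE(created_at)'))
        ->orderBy('day')
        ->get();
}
```

Column order in the composite index matters: putting `status` first suits queries that filter on an exact status and a date range, which is the shape of the query shown here.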

Results

After these optimizations, our API consistently handles over 2 million requests per day with a 99.9% success rate and average response times under 50ms.
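
For capacity planning it can help to translate a daily figure into a per-second rate. The snippet below is simple arithmetic on the 2 million requests per day mentioned above; the peak-to-average ratio is a parameter you would have to measure yourself, and no value for it is claimed here.

```php
<?php

declare(strict_types=1);

// Average requests per second implied by a daily request count.
function averageRequestsPerSecond(int $requestsPerDay): float
{
    $secondsPerDay = 24 * 60 * 60; // 86,400

    return $requestsPerDay / $secondsPerDay;
}

// Estimated peak rate for a given peak-to-average ratio (an input you supply).
function estimatedPeakRequestsPerSecond(int $requestsPerDay, float $peakToAverage): float
{
    return averageRequestsPerSecond($requestsPerDay) * $peakToAverage;
}

printf("average: %.1f req/s\n", averageRequestsPerSecond(2_000_000)); // average: 23.1 req/s
```

Traffic is rarely uniform across the day, so the peak estimate, not the average, is the number to size workers and connection pools against.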
