Understanding AWS Lambda Concurrency: Cold Starts, Warm Starts, and Container Reuse

2025-08-25

When discussing AWS Lambda, a common point of confusion revolves around how concurrency is managed and what exactly happens when your function scales. Let's clarify the mechanics of Lambda concurrency, focusing on container reuse, new container creation, and the concept of "cold starts" versus "warm starts."

Lambda Doesn't "Create New Lambdas" for Concurrency

It's a common misconception that each concurrent execution of a Lambda function results in the creation of a "new Lambda." This isn't accurate. Instead, Lambda manages containers – isolated runtime environments where your function code executes.

What Actually Happens:

1. Container Reuse (Warm Starts)

AWS Lambda prioritizes reusing existing containers whenever possible. If a container has just finished processing a request, it remains "warm" for a period (typically around 15 minutes). When a new request arrives for the same function, and a warm container is available, Lambda can immediately route the request to that container.
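
A simple way to observe this reuse is to keep state in module scope, which survives between invocations that land on the same warm container. A minimal sketch (the counter and log message are just illustrative):


    // Module-scope state is created once per container and persists
    // across warm invocations of that container.
    let invocationCount = 0;

    exports.handler = async (event) => {
        invocationCount += 1;
        // A value of 1 means this request triggered a brand-new container;
        // anything higher means the container was reused (a warm start).
        console.log(`Invocation #${invocationCount} in this container`);
        return { invocationCount };
    };
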

2. New Container Creation (Cold Starts)

Lambda only creates a new container when no warm container is available to take the request, typically because:

- All existing warm containers are busy handling other requests (i.e., concurrency has increased)
- No container exists yet (first invocation, or all previously warm containers were terminated after sitting idle)
- The function's code or configuration has changed, so existing containers can no longer be reused

When a new container has to be created, the request that triggered it experiences a "cold start."
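
You can detect cold starts directly in code with a module-scope flag; the top-level assignment runs once when the container is initialized, so only the first request in each container sees it set. A rough sketch:


    // Runs once, during container initialization.
    let isColdStart = true;

    exports.handler = async (event) => {
        const wasColdStart = isColdStart;
        isColdStart = false; // every later invocation in this container is warm

        console.log(wasColdStart
            ? 'Cold start: this container was just created'
            : 'Warm start: reusing an existing container');

        return { coldStart: wasColdStart };
    };
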

3. The Process Illustrated:

Consider a scenario with sequential and concurrent requests. If Request 2 arrives after Request 1 has already finished:


    Request 1 → no container exists → Container A created (cold start) → handles request → stays warm
    Request 2 → Container A is free → Container A reused (warm start) → handles request


BUT, if Request 2 comes in while Request 1 is still running, Container A is busy, so Lambda must provision a second container:


    Request 1 → Container A created (cold start) → still processing...
    Request 2 → Container A is busy → Container B created (cold start) → handles request in parallel

Key Points:

- Sequential requests can be served by a single warm container; it is concurrency that forces new containers to be created
- Each container processes exactly one request at a time
- Idle containers are eventually terminated (typically after around 15 minutes without traffic)

Example:

If you have 1 Lambda function deployed and receive 50 concurrent requests, the result will be up to 50 containers running the same function code simultaneously. After the traffic subsides, most of these containers will be terminated, with perhaps 1-2 kept warm for future requests.
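
One way to verify this yourself is to have the function return an identifier generated once at init time (so each container reports a stable value) and then invoke it many times in parallel. A sketch using the Node.js AWS SDK v2; the function name, region, and the containerId field are assumptions for illustration:


    // Hypothetical load test: fire 50 concurrent invocations and count how
    // many distinct containers handled them. Assumes the target function
    // returns a "containerId" created in module scope, e.g.
    //   const containerId = require('crypto').randomUUID();
    const AWS = require('aws-sdk');
    const lambda = new AWS.Lambda({ region: 'us-east-1' }); // placeholder region

    async function run() {
        const calls = Array.from({ length: 50 }, () =>
            lambda.invoke({
                FunctionName: 'ProcessOrder',            // illustrative name
                Payload: JSON.stringify({ test: true })
            }).promise()
        );

        const results = await Promise.all(calls);
        const containerIds = new Set(
            results.map(r => JSON.parse(r.Payload).containerId)
        );

        console.log(`50 requests were handled by ${containerIds.size} container(s)`);
    }

    run().catch(console.error);
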

So, concurrent execution creates new containers/instances, not new Lambda functions.

Total Concurrent Executions == Total Lambda Containers?

Yes, exactly! The total number of concurrent executions at any given moment is equal to the total number of active Lambda containers. Each concurrent execution runs in its own container:

Visual Example:

Lambda Function "ProcessOrder"


    ├── Container 1 (handling request A) ← 1 concurrent execution
    ├── Container 2 (handling request B) ← 1 concurrent execution
    ├── Container 3 (handling request C) ← 1 concurrent execution
    └── Container 4 (warm, waiting)     ← 0 concurrent executions
    

Total: 3 concurrent executions = 3 containers actively running (the idle warm container doesn't count toward concurrency)
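
If you want to see this number for a real function, CloudWatch publishes a ConcurrentExecutions metric under the AWS/Lambda namespace. A minimal sketch with the Node.js AWS SDK v2 (region and function name are placeholders):


    const AWS = require('aws-sdk');
    const cloudwatch = new AWS.CloudWatch({ region: 'us-east-1' });

    async function peakConcurrency(functionName) {
        const now = new Date();
        const data = await cloudwatch.getMetricStatistics({
            Namespace: 'AWS/Lambda',
            MetricName: 'ConcurrentExecutions',
            Dimensions: [{ Name: 'FunctionName', Value: functionName }],
            StartTime: new Date(now.getTime() - 5 * 60 * 1000), // last 5 minutes
            EndTime: now,
            Period: 60,              // one data point per minute
            Statistics: ['Maximum']  // peak concurrency within each period
        }).promise();

        console.log(data.Datapoints);
    }

    peakConcurrency('ProcessOrder').catch(console.error);
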

Important Notes:

- Warm containers that are sitting idle do not count toward your concurrent execution count
- Concurrency is capped by an account-level, per-Region limit (1,000 by default, and it can be raised via a quota increase)

The bottom line: the concurrent execution count directly equals the number of containers actively running at that moment.

Does Each New Container Always Incur the Same Startup Cost?

No, not always! Container creation involves two distinct phases:

Container Creation Process:

1. Container Startup (Always happens for a new container)

When a new container is provisioned, the following steps always occur:

- The deployment package (or container image) is downloaded
- A fresh execution environment is created
- The language runtime (Node.js, Python, Java, etc.) is started

This phase typically takes anywhere from roughly 100 ms to 1 second, depending on the size of your deployment package and the runtime involved.

2. Your Function Initialization (Depends on your code)

After the container starts, your function's initialization code runs. This is everything that lives outside your handler, for example:

- require / import statements that load modules
- Creating SDK clients and opening database connections
- Reading configuration, secrets, or other environment-specific settings

The time taken for this phase varies greatly with what that code does, from effectively 0 ms for a bare handler to 10 seconds or more for heavy frameworks and large dependency trees.

Cold Start = Container Startup + Function Initialization

The total time for a cold start is the sum of the container startup time and your function's initialization time.
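
To put rough numbers on it: if container startup takes ~300 ms and your initialization code takes another ~700 ms, the request that triggered the cold start waits about 1 second before the handler even begins running. Every warm request that follows skips both costs entirely.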

Examples:

Simple Function (Fast startup):


    exports.handler = async (event) => {
        return { message: "Hello" }; // Very fast, minimal initialization
    }
    

Complex Function (Slower startup):


    const AWS = require('aws-sdk'); // Loading the SDK module takes time
    const db = new AWS.DynamoDB();  // Creating the client adds to init time (no connection is opened yet)

    exports.handler = async (event) => {
        // Handler execution
    }
    

Key Points:

- Container startup overhead occurs for every new container; you can't eliminate it, only keep it small (smaller packages, lighter runtimes)
- Function initialization time is entirely under your control, and it runs only once per container
- Warm starts skip both phases, so work done during initialization is amortized across every request that container later handles

So, while every new container creation involves startup overhead, the duration of that overhead (the cold start) is significantly influenced by your function's initialization code.
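
One pattern that can soften this, when an expensive dependency is only needed on some code paths, is lazy initialization: defer the work until the first request that needs it, then cache the result in module scope so every warm invocation reuses it. A sketch under those assumptions (the table name and event shape are illustrative):


    const AWS = require('aws-sdk');

    let db; // created lazily, then reused by every warm invocation

    function getDb() {
        if (!db) {
            // This cost is paid by the first request that needs DynamoDB,
            // not by every cold start.
            db = new AWS.DynamoDB.DocumentClient();
        }
        return db;
    }

    exports.handler = async (event) => {
        const result = await getDb().get({
            TableName: process.env.TABLE_NAME, // assumed environment variable
            Key: { id: event.id }              // assumed event shape
        }).promise();

        return result.Item;
    };


The trade-off is that the first request to hit that path absorbs the initialization cost instead of the cold start itself; keeping clients in module scope, as in the complex example above, is still the usual default.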