one-line definition
A cold start is the delay that occurs when a serverless function or container spins up from scratch to handle a request, adding latency to the first call.
formula: Cold Start Latency = Time to initialize runtime + Time to load dependencies + Time to execute handler. Typically 100ms-5s depending on runtime and bundle size.
tl;dr
Cold starts are the hidden tax of serverless. Your API might respond in 80ms normally, but the first request after idle hits 2 seconds. If that first request is a user's first experience with your product, you just made a terrible first impression. Measure cold start latency separately from warm latency.
Simple definition
When you deploy to a serverless platform like AWS Lambda, Vercel, or Cloudflare Workers, your code does not run on a server that is always on. Instead, the platform spins up a fresh instance when a request comes in. If no instance is warm (recently used), the platform has to cold-start one: boot the runtime, load your code and dependencies, initialize your handler. That startup cost is the cold start, and it adds anywhere from 50ms to 5 seconds to the request. Subsequent requests reuse the warm instance and are fast. But after a period of inactivity (usually 5-15 minutes), the instance shuts down and the next request cold-starts again.
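This lifecycle is observable from inside the function itself: anything at module scope runs once per cold start, while the handler body runs on every invocation. A minimal Node.js sketch (the handler shape follows AWS Lambda's convention; the `coldStart` response field is illustrative):

```typescript
// Module scope executes once, during the cold start of each instance.
let coldStart = true;

export async function handler(event: unknown): Promise<{ coldStart: boolean }> {
  const wasCold = coldStart;
  coldStart = false; // every later request served by this instance is warm
  return { coldStart: wasCold };
}
```

Logging that flag per invocation gives you the raw data to separate cold and warm latency in your dashboards.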
How to calculate it
Cold Start Latency = Runtime initialization + Dependency loading + Handler execution (first run)
Measure it by checking the latency of your first request after a period of inactivity vs. a warm request:
- Deploy your function, wait 20 minutes, then send a request. Note the latency.
- Send another request immediately after. Note the latency.
- The difference is your cold start overhead.
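The steps above can be scripted. A sketch assuming a hypothetical endpoint URL and Node 18+ (for global `fetch`); the idle period itself is left to you or your scheduler:

```typescript
// Time one request; the first call after idle includes cold start overhead.
async function timeRequest(url: string): Promise<number> {
  const start = performance.now();
  await fetch(url);
  return performance.now() - start;
}

// Compare a (presumed) cold request with the warm one sent right after it.
async function measureColdStart(url: string) {
  const cold = await timeRequest(url); // send after ~20 min of inactivity
  const warm = await timeRequest(url); // instance is now warm
  return { cold, warm, overhead: cold - warm };
}
```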
Track cold start frequency in your monitoring. If 30% of your requests are cold starts, you have an architecture problem. If it is under 5%, cold starts are not your bottleneck.
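Those thresholds are easy to compute from per-request cold start flags, assuming your monitoring records one boolean per invocation (function names here are illustrative):

```typescript
// Fraction of requests that hit a cold start.
function coldStartRate(flags: boolean[]): number {
  if (flags.length === 0) return 0;
  return flags.filter(Boolean).length / flags.length;
}

// The rule of thumb from the text: over 30% is an architecture problem,
// under 5% means cold starts are not the bottleneck.
function assessColdStarts(flags: boolean[]): string {
  const rate = coldStartRate(flags);
  if (rate > 0.3) return "architecture problem";
  if (rate < 0.05) return "not your bottleneck";
  return "worth optimizing";
}
```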
Example
You build a webhook processing API on AWS Lambda with Node.js. Warm response time: 45ms. After 15 minutes of no traffic (common during off-hours), the next request takes 1,200ms. Your bundle is 18MB because you included the entire AWS SDK and a PDF generation library. You tree-shake the SDK (importing only the S3 client instead of the full package) and lazy-load the PDF library only for the endpoints that need it. Bundle drops to 3MB. Cold start drops from 1,200ms to 280ms. For your European users who hit the API first thing in the morning after overnight idle, the experience goes from noticeably slow to nearly instant.
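The lazy-load half of that fix is a small pattern: defer the heavy dependency until the first request that actually needs it, so it never executes during the cold start. A sketch using `node:zlib` as a stand-in for a heavy library like a PDF generator:

```typescript
// Cached module reference; stays null until the first request that needs it.
let heavy: typeof import("node:zlib") | null = null;

async function getHeavy() {
  if (!heavy) {
    // Dynamic import runs on first use, not at instance boot.
    heavy = await import("node:zlib");
  }
  return heavy;
}

export async function render(data: string): Promise<Buffer> {
  const zlib = await getHeavy();
  return zlib.gzipSync(Buffer.from(data));
}
```

The tree-shaking half is mostly import discipline: with AWS SDK v3, `import { S3Client } from "@aws-sdk/client-s3"` pulls in only the S3 client instead of the entire SDK.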
Related terms
- API Latency
- Time to Value
- Activation Rate
FAQ
Which serverless platforms have the worst cold starts?
AWS Lambda with Java or .NET runtimes can hit 3-5 seconds. Node.js and Python are faster at 100-500ms. Cloudflare Workers are the fastest, often under 50ms, because they run code in V8 isolates instead of full containers; Vercel's Edge Functions use a similar isolate-based runtime. If cold starts matter to your product, your runtime and platform choice are the biggest levers.
How do I avoid cold starts without paying for always-on servers?
Keep your function bundles small (under 5MB). Use lightweight runtimes like Node.js or Go. On AWS Lambda, use provisioned concurrency for critical endpoints. Or use a cron job to ping your functions every 5 minutes to keep them warm — a hack, but it works and costs almost nothing.
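The warming ping can be as small as a scheduled fetch. A sketch assuming a placeholder URL; in production you would use your platform's scheduler (EventBridge, Cloud Scheduler, a plain cron job) rather than a long-running process:

```typescript
const FIVE_MINUTES = 5 * 60 * 1000;

// Ping the endpoint on an interval so the platform never reclaims the
// idle instance. Failures are swallowed: a missed ping just means the
// next real request may cold-start.
function startKeepWarm(url: string, intervalMs = FIVE_MINUTES): NodeJS.Timeout {
  return setInterval(() => {
    fetch(url).catch(() => {
      /* non-fatal */
    });
  }, intervalMs);
}
```

Call `startKeepWarm("https://example.com/api/health")` once at startup and hold on to the returned timer so you can `clearInterval` it on shutdown.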