one-line definition
API latency is the time between sending a request to an API and receiving the response — it directly affects how fast your product feels to users.
formula: API Latency = Time of response received − Time of request sent. Measured in milliseconds (ms). Track p50 (median), p95, and p99 percentiles.
tl;dr
Users do not think about latency — they think about whether your app feels fast or slow. Every 100ms of API latency above 200ms erodes trust in your product. Measure p95 latency, not averages. Your average can look great while 5% of users have a terrible experience.
Simple definition
API latency is the round-trip time for a request: your app asks the server for something, the server processes it, and sends a response. That total time — measured in milliseconds — is the latency. It includes network transit, server processing, database queries, and any external API calls your server makes. For solo founders, latency matters because it directly shapes the user experience. A dashboard that loads in 150ms feels instant. The same dashboard at 2 seconds feels broken. Users will not tell you "your API is slow" — they will just stop using your product.
How to calculate it
API Latency = Response timestamp − Request timestamp
Measure it at three levels:
- p50 (median): Half of requests are faster than this. Your typical user experience.
- p95: 95% of requests are faster. Your baseline for "fast enough."
- p99: 99% are faster. Your worst-case regular experience.
Use tools like Vercel Analytics, Datadog, or even a simple middleware that logs response times. Track latency per endpoint — your /api/users might be 50ms while /api/reports is 800ms. Fix the slow ones first.
Example
You build a project management tool. Your main dashboard API call returns in 180ms on average. Feels fine in testing. But your p95 is 1,400ms — meaning 1 in 20 page loads takes over a second. You dig in and find the slow requests happen when users have 200+ tasks. The database query scans the full table instead of using an index. You add a composite index on (user_id, status, created_at). The p95 drops from 1,400ms to 220ms. Average barely changes — it was already fine. But the users who were having the worst experience now have a dramatically better one, and your 7-day retention in that cohort improves by 12%.
Related reading
Related terms
- Time to Value
- Session Duration
- Activation Rate
FAQ
What is acceptable API latency for a SaaS product?+
Under 200ms for user-facing actions (loading data, saving changes). Under 500ms for complex operations (search, report generation). Above 1 second and users start noticing. Above 3 seconds and they start leaving. Always measure p95, not just average — your slowest users have the worst experience.
How do I reduce API latency on a budget?+
Start with the biggest wins: add database indexes for slow queries, implement response caching for data that doesn't change every request, use a CDN for static assets, and move your server closer to your users geographically. These four changes alone can cut latency by 50-80% for most bootstrapped apps.