You deploy your Next.js app to Vercel. You want to monitor your API routes. You look at Datadog's setup guide: "Install the Datadog Agent on your host machine."
You don't have a host machine. You have serverless functions that exist for 50 milliseconds, serve a request, and vanish. The agent has nowhere to live.
This is the fundamental problem with serverless monitoring: every tool assumes you have a server. On Vercel, Lambda, and Cloudflare Workers, you don't. This guide covers what actually works.
Why Serverless Breaks Traditional Monitoring
Traditional APM tools (Datadog, New Relic, Dynatrace) were built for a world of long-running servers. They assume:
- A daemon process can run alongside your app (it can't — there's no host)
- The process persists between requests (it doesn't — functions are ephemeral)
- Initialization happens once (it doesn't — every cold start re-initializes)
- You can install system-level agents (you can't — no SSH access to the runtime)
These assumptions break on serverless, creating five specific problems:
1. Cold starts corrupt your latency data
A cold start adds 200-2000ms to the first request after a function scales up. If your APM tool doesn't separate cold start latency from request processing latency, your P95 looks terrible even when your code is fast.
```
// What the APM sees:
// Request 1 (cold): 1,450ms ← 1,200ms cold start + 250ms processing
// Request 2 (warm): 85ms
// Request 3 (warm): 92ms
// Request 4 (warm): 78ms
// Request 5 (cold): 1,380ms ← Another cold start
//
// P95: 1,420ms — looks broken
// Actual app performance: 85ms — perfectly healthy
//
// Without cold start separation, you're optimizing the wrong thing.
```

2. Agent initialization adds to cold starts
APM agents need to boot when your function starts. That initialization isn't free:
| Agent | Init overhead | Impact on cold start |
|---|---|---|
| Datadog (dd-trace) | 200-800ms | +40-160% on a 500ms cold start |
| New Relic | 200-400ms | +40-80% |
| Sentry | 50-150ms | +10-30% |
| Nurbak Watch | 5-15ms | +1-3% |
| No agent | 0ms | Baseline |
For functions that cold-start frequently (low-traffic routes, edge functions, cron jobs), a 400ms agent overhead can double total response time.
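One common mitigation, where your tooling allows it, is to move agent initialization off the cold start path and pay the cost lazily on first use. A minimal sketch of the pattern; the agent object here is a stand-in, not a real SDK:

```typescript
// Generic "initialize once, lazily" helper: the expensive setup runs on the
// first call that needs it, not during the cold start, and is reused afterwards.
function lazyOnce<T>(init: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => (cached ??= init());
}

// Hypothetical usage: defer agent setup until the first request needs it.
let initCount = 0;
const getAgent = lazyOnce(async () => {
  initCount += 1; // stands in for an expensive dynamic import + handshake
  return { ready: true };
});
```

The trade-off: the first request that touches the agent absorbs the init cost instead of the cold start, so this helps most when many invocations never need the agent at all.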
3. Concurrency is per-function, not per-host
On a traditional server, you monitor CPU and memory to predict capacity. On serverless, each invocation gets its own isolated environment. "CPU usage" is meaningless. What matters is concurrent executions, throttling events, and per-invocation duration.
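Little's law gives a quick way to estimate that concurrency from numbers you already have: concurrent executions ≈ request rate × average duration. A small sketch with illustrative numbers:

```typescript
// Little's law: executions in flight ≈ arrival rate (req/s) × mean duration (s).
function estimatedConcurrency(requestsPerSecond: number, meanDurationMs: number): number {
  return requestsPerSecond * (meanDurationMs / 1000);
}

// 100 req/s at 250ms per request keeps roughly 25 executions in flight.
// Compare this number against the platform's concurrency limit, not CPU usage.
const inFlight = estimatedConcurrency(100, 250);
```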
4. Logs are scattered across invocations
Each function invocation produces isolated logs. Correlating a user's request across multiple function invocations requires trace IDs — something that most serverless platforms don't provide natively.
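A minimal way to get that correlation is to mint a trace ID at the first function a request hits and pass it along in a header; the `x-trace-id` name below is a convention chosen for illustration, not a platform standard:

```typescript
import { randomUUID } from "node:crypto";

// Reuse an incoming trace ID if an upstream function already set one;
// otherwise mint a new one at the entry point.
function getTraceId(headers: Record<string, string | undefined>): string {
  return headers["x-trace-id"] ?? randomUUID();
}

// Stamp every log line with the trace ID so logs from separate invocations
// handling the same user request can be joined back together.
function logWithTrace(traceId: string, message: string): string {
  return JSON.stringify({ traceId, message, ts: Date.now() });
}
```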
5. Costs scale per invocation, not per host
A function that runs for 1ms per request at 10,000 RPM has a very different cost profile from one that runs for 500ms at 100 RPM. Traditional monitoring tracks host costs. Serverless monitoring needs to track cost per function, per invocation.
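To make that concrete, here is a rough cost model using AWS Lambda's published x86 pricing at the time of writing (about $0.0000166667 per GB-second plus $0.20 per million requests); treat the numbers as illustrative and check current pricing, and note the free tier is ignored:

```typescript
// Rough monthly cost model for a Lambda-style function.
// Pricing constants are AWS's published x86 rates at time of writing — verify before relying on them.
const GB_SECOND_PRICE = 0.0000166667;
const PER_MILLION_REQUESTS = 0.2;

function monthlyCostUsd(rpm: number, durationMs: number, memoryMb: number): number {
  const invocations = rpm * 60 * 24 * 30; // requests per 30-day month
  const gbSeconds = invocations * (durationMs / 1000) * (memoryMb / 1024);
  return gbSeconds * GB_SECOND_PRICE + (invocations / 1e6) * PER_MILLION_REQUESTS;
}

// 1ms at 10,000 RPM: the bill is dominated by the per-request fee.
const fastHighTraffic = monthlyCostUsd(10_000, 1, 128);
// 500ms at 100 RPM: the bill is dominated by compute time.
const slowLowTraffic = monthlyCostUsd(100, 500, 128);
```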
What to Monitor on Serverless
The five metrics that matter for serverless APIs:
| Metric | Why it matters | How to track |
|---|---|---|
| Cold start frequency | How often your functions reinitialize | Compare init time vs request time |
| Cold start duration | How much latency cold starts add | Measure init phase separately |
| P95 latency (warm only) | True application performance | Exclude cold start requests |
| Error rate per function | Which routes are failing | Track 4xx/5xx per API route |
| Throttling / concurrency limits | When the platform rejects requests | Platform metrics (CloudWatch, Vercel logs) |
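Cold start detection itself is straightforward in a Node.js runtime: module scope is re-evaluated on every cold start but survives warm invocations, so a module-level flag separates the two. A sketch of that plus a warm-only P95; the helper names are illustrative:

```typescript
// Module scope survives between warm invocations but is re-evaluated on every
// cold start, so a simple flag distinguishes the two cases.
let coldStart = true;

interface Sample { durationMs: number; cold: boolean; }
const samples: Sample[] = [];

function recordInvocation(durationMs: number): Sample {
  const sample = { durationMs, cold: coldStart };
  coldStart = false; // every later invocation in this instance is warm
  samples.push(sample);
  return sample;
}

// P95 over warm requests only, so cold starts don't corrupt the number.
function warmP95(all: Sample[]): number {
  const warm = all.filter((s) => !s.cold).map((s) => s.durationMs).sort((a, b) => a - b);
  if (warm.length === 0) return 0;
  return warm[Math.min(warm.length - 1, Math.floor(warm.length * 0.95))];
}
```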
Platform-Specific Monitoring
Vercel (Next.js)
Vercel provides basic analytics in the dashboard: function invocations, duration, and errors. But no per-route P95, no cold start tracking, and no real-time alerts.
For meaningful monitoring, use the Next.js instrumentation.ts hook — the official entry point for observability that runs once per function initialization:
```ts
// instrumentation.ts
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    // This runs once per cold start
    // Initialize your monitoring here
    const { initWatch } = await import('@nurbak/watch')
    initWatch({ apiKey: process.env.NURBAK_WATCH_KEY })
  }
}
```

AWS Lambda
Lambda publishes metrics to CloudWatch natively: invocations, duration, errors, throttles, and concurrent executions. Enable Lambda Insights for enhanced metrics including memory usage and init duration.
```bash
# Enable Lambda Insights via AWS CLI
aws lambda update-function-configuration \
  --function-name my-api \
  --layers arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:38
```

Key CloudWatch metrics to alarm on:
- Duration — P95 per function. Alert when 2x baseline.
- Errors — Any sustained error rate above 0.1%.
- Throttles — Any throttling means you're hitting concurrency limits.
- ConcurrentExecutions — Track against your account limit (default 1,000).
- InitDuration — Cold start time. Alert if increasing (usually means bundle size grew).
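The "alert when 2x baseline" rule is simple enough to encode directly in whatever checks your alerting pipeline runs against these metrics; a sketch, with the multiplier as a tunable example:

```typescript
// Fire an alert when current P95 exceeds the baseline by a multiplier.
// The 2x default is a common starting point, not a universal rule — tune per function.
function shouldAlert(currentP95Ms: number, baselineP95Ms: number, multiplier = 2): boolean {
  return currentP95Ms > baselineP95Ms * multiplier;
}
```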
Cloudflare Workers
Workers use V8 isolates instead of containers. Cold starts are under 5ms (vs 200-2000ms on Lambda/Vercel). This changes the monitoring equation — cold starts are not a significant concern.
Monitor via the Cloudflare dashboard or Workers Analytics API:
```bash
# Cloudflare Workers Analytics API
curl -X POST https://api.cloudflare.com/client/v4/graphql \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "query": "{ viewer { accounts(filter: {accountTag: \"ACCOUNT_ID\"}) { workersInvocationsAdaptive(limit: 10, filter: {datetime_gt: \"2026-03-31\"}) { sum { requests errors subrequests } quantiles { cpuTimeP50 cpuTimeP99 } } } } }"
  }'
```

Three Monitoring Approaches Compared
| Approach | Cold start impact | Setup | Cost | Coverage |
|---|---|---|---|---|
| Platform native (Vercel Analytics, CloudWatch) | 0ms | Automatic | Free-$10/mo | Basic: invocations, duration, errors |
| Lightweight SDK (Nurbak Watch, Sentry) | 5-50ms | 5-15 min | $0-29/mo | Per-route metrics, real-time alerts |
| Full APM (Datadog, New Relic) | 200-800ms | 1-4 hours | $200-800/mo | Full: traces, logs, infra, APM |
Recommendation for most serverless teams: Start with platform-native metrics (free) + a lightweight SDK for per-route monitoring and alerts. Only add a full APM if you need distributed tracing across 20+ functions or detailed infrastructure profiling.
Nurbak Watch: Built for Serverless Next.js
Nurbak Watch was designed specifically for serverless Next.js deployments on Vercel. It uses the instrumentation.ts hook with minimal cold start impact:
```bash
npm install @nurbak/watch
```

```ts
// instrumentation.ts
import { initWatch } from '@nurbak/watch'

export function register() {
  initWatch({
    apiKey: process.env.NURBAK_WATCH_KEY,
  })
}
```

What you get:
- Every API route auto-discovered and monitored
- Cold start frequency and duration tracked separately
- P50/P95/P99 latency from warm requests (not corrupted by cold starts)
- Error rates per endpoint
- Alerts via Slack, email, or WhatsApp in under 10 seconds
- +5-15ms cold start overhead (vs 200-800ms for Datadog/New Relic)
Free during beta, $29/month after. No per-host pricing — because there is no host.

