You just deployed your Next.js app to Vercel. Your team lead says you need monitoring. So you Google "Next.js APM" and find Datadog's setup guide.
Step 1: Install dd-trace. Step 2: Install the Datadog Agent on your server. Step 3: Configure datadog.yaml with 47 options. Step 4: Set up trace collection. Step 5: Configure log forwarding. Step 6: Realize the Datadog Agent needs a server to run on — and you're on Vercel.
You don't have a server. You have serverless functions. The agent has nowhere to live.
This is the fundamental mismatch between traditional APM tools and modern Next.js deployments. The tools were built for a world of VMs and containers. Your app lives in a different world.
What APM Agents Actually Do (and Cost)
An APM agent is a separate process that runs alongside your application. Datadog's agent, New Relic's daemon, Dynatrace's OneAgent — they all follow the same pattern:
- A daemon process runs on your host machine (200-500MB RAM)
- A language-specific library (dd-trace, newrelic) instruments your code
- The library sends trace data to the local daemon over a Unix socket or localhost
- The daemon batches, compresses, and forwards data to the vendor's cloud
This architecture made sense when everyone ran on EC2 instances or Kubernetes pods. It breaks down completely in three scenarios that Next.js developers hit constantly:
Problem 1: Serverless has no host machine
On Vercel, Netlify, or AWS Lambda, there is no persistent server. Each API route invocation is an isolated function. The agent daemon has nowhere to run. Some vendors offer "serverless mode" where the library sends data directly to the cloud — but that means a network request on every function invocation, adding 50-200ms of latency to every API call.
Problem 2: Cold starts get worse
APM libraries need to initialize when your function starts. Here's what happens during a cold start with dd-trace:
// What your code looks like
import tracer from 'dd-trace'
tracer.init() // This is not free
// What actually happens during init():
// 1. Load configuration (read env vars, parse options) ~20ms
// 2. Initialize span processors ~30ms
// 3. Set up monkey-patching for http, fetch, pg, etc. ~80ms
// 4. Establish connection to collector ~100ms
// 5. Load sampling rules ~15ms
// ─────────────────────────────────────────────────────────
// Total cold start overhead: ~245ms
On a function that normally cold-starts in 300ms, you just added roughly 80% more initialization time. And this happens on every cold start, which on Vercel's free and Pro tiers can be every few minutes for low-traffic routes.
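The per-step timings above are estimates, so it is worth measuring initialization cost in your own function before drawing conclusions. A minimal sketch, assuming two hypothetical helpers (`timeInit` and `overheadPercent` are illustrative names, not part of dd-trace or any SDK):

```typescript
// Hypothetical helper: times any synchronous init callback.
// Not part of dd-trace or any monitoring SDK.
export function timeInit<T>(label: string, init: () => T): { result: T; ms: number } {
  const start = performance.now()
  const result = init()
  const ms = performance.now() - start
  console.log(`[cold-start] ${label} took ${ms.toFixed(1)}ms`)
  return { result, ms }
}

// Init time expressed as a percentage of a baseline cold start.
export function overheadPercent(initMs: number, baselineMs: number): number {
  return Math.round((initMs / baselineMs) * 100)
}
```

In a real function you could wrap the call from the snippet above, for example `timeInit('dd-trace', () => tracer.init())`, and compare the result against your own baseline. With the article's figures, `overheadPercent(245, 300)` comes out at 82.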
Problem 3: Configuration complexity
A typical Datadog setup for a Next.js app requires:
- DD_API_KEY: your API key
- DD_SITE: the Datadog region
- DD_SERVICE: your service name
- DD_ENV: environment (production, staging)
- DD_VERSION: your app version
- DD_TRACE_ENABLED: enable/disable tracing
- DD_LOGS_INJECTION: correlate logs with traces
- DD_RUNTIME_METRICS_ENABLED: runtime stats
- DD_PROFILING_ENABLED: code profiling
- DD_TRACE_SAMPLE_RATE: sampling rate
That's 10 environment variables just to get started. Miss one and you get partial data. Set one wrong and you get a $2,000 bill from trace overages.
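If you do stay on this setup, one way to catch a missing variable before it turns into partial data is to check the list at startup. A minimal sketch: the variable names come from the list above, while `missingEnvVars` is a hypothetical helper, not something Datadog ships.

```typescript
// Variable names taken from the list above.
const REQUIRED_DD_VARS: readonly string[] = [
  'DD_API_KEY', 'DD_SITE', 'DD_SERVICE', 'DD_ENV', 'DD_VERSION',
  'DD_TRACE_ENABLED', 'DD_LOGS_INJECTION', 'DD_RUNTIME_METRICS_ENABLED',
  'DD_PROFILING_ENABLED', 'DD_TRACE_SAMPLE_RATE',
]

// Hypothetical helper: returns the names that are unset or blank.
export function missingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_DD_VARS,
): string[] {
  return required.filter((name) => {
    const value = env[name]
    return value === undefined || value.trim() === ''
  })
}
```

Calling `missingEnvVars(process.env)` at boot and logging the result is usually enough to spot a typo in the Vercel dashboard before the first deploy goes out.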
Compare that to what you actually need: "Tell me when my API routes are slow or broken."
The Agent Tax: What You're Really Paying
Beyond the technical friction, agents impose hidden costs that compound over time:
| Cost | Agent-Based (Datadog/New Relic) | Agentless (Lightweight SDK) |
|---|---|---|
| Memory overhead | 300-500MB (daemon) + 50-100MB (library) | < 5MB |
| Cold start penalty | +200-800ms | +5-15ms |
| Environment variables | 10-15 required | 1 (API key) |
| Config files | datadog.yaml / newrelic.js | None |
| Setup time | 2-4 hours | 5 minutes |
| Monthly cost (small team) | $71-300+/host/month | $0-29/month |
| Works on Vercel serverless | Partially (degraded mode) | Fully |
| Requires infrastructure team | Often yes | No |
For a solo developer or a team of five shipping a SaaS product, the agent model is overkill. You're paying enterprise complexity for a problem that has a much simpler solution.
What Agentless Monitoring Looks Like
Agentless monitoring flips the architecture. Instead of a daemon process + library + collector pipeline, you get a single lightweight SDK that runs inside your application process:
// The entire monitoring setup for a Next.js app:
// instrumentation.ts
import { initWatch } from '@nurbak/watch'
export function register() {
  initWatch({
    apiKey: process.env.NURBAK_WATCH_KEY,
  })
}
That's it. No daemon, no config file, no 10 environment variables. The SDK:
- Initializes in under 15ms (vs 200-800ms for APM agents)
- Uses less than 5MB of memory (vs 300-500MB for agents)
- Auto-discovers every API route in your Next.js app
- Batches and sends metrics asynchronously, off the request path, so responses do not wait on monitoring
- Survives serverless cold starts because it's part of your function, not a separate process
The key insight: your Next.js app already knows everything about its API routes. It knows every request path, every response code, every error. You don't need a separate process to observe it — you just need to capture what's already there.
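Cold starts are a good illustration of that point. Code at module scope runs once per function instance, so in-process code can tell a cold invocation from a warm one without outside help. A minimal sketch of the general technique; `markInvocation` is a hypothetical name, and this is not a claim about how any particular SDK implements it:

```typescript
// Module scope runs once per function instance, so this flag stays true
// only until the first invocation on a fresh (cold) instance is seen.
let isCold = true
const bootedAt = Date.now()

// Hypothetical helper: call once at the start of each request handler.
export function markInvocation(now: number = Date.now()): {
  coldStart: boolean
  instanceAgeMs: number
} {
  const coldStart = isCold
  isCold = false
  return { coldStart, instanceAgeMs: now - bootedAt }
}
```

The first call on a new instance reports `coldStart: true`. Every later call on the same instance reports `false`.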
How It Works Under the Hood
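Next.js 13.2 introduced the instrumentation.ts hook specifically for observability. It started as an experimental feature behind the experimental.instrumentationHook flag and became stable in Next.js 15. When a new server instance starts, Next.js calls the register() function once. This is the official, supported entry point for monitoring.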
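One detail worth knowing: register() can be called in more than one runtime, and Next.js exposes the current one through the NEXT_RUNTIME environment variable ('nodejs' or 'edge'). If a monitoring library depends on Node.js APIs, a guard keeps it out of the edge runtime. A minimal sketch, where `registerFor` is a hypothetical helper and the Node-only requirement is an assumption about the SDK:

```typescript
type Init = () => void | Promise<void>

// Hypothetical helper: runs `init` only when the runtime is Node.js.
// Returns whether the init callback was executed.
export async function registerFor(
  runtime: string | undefined,
  init: Init,
): Promise<boolean> {
  if (runtime !== 'nodejs') return false
  await init()
  return true
}

// In instrumentation.ts, register() could then delegate like this
// (assumes @nurbak/watch should only load in the Node.js runtime):
//
// export async function register() {
//   await registerFor(process.env.NEXT_RUNTIME, async () => {
//     const { initWatch } = await import('@nurbak/watch')
//     initWatch({ apiKey: process.env.NURBAK_WATCH_KEY })
//   })
// }
```

The dynamic import inside the callback keeps Node-only code out of the edge bundle, which is the same pattern the Next.js documentation uses for runtime-specific instrumentation.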
An agentless SDK uses this hook to:
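// Simplified view of what happens inside the SDK
import diagnostics_channel from 'node:diagnostics_channel'
import type { IncomingMessage, ServerResponse } from 'node:http'

type HttpEvent = { request: IncomingMessage; response: ServerResponse }
type Metric = { path?: string; method?: string; status: number; duration: number }

export function initWatch(config: { apiKey: string }) {
  const buffer: Metric[] = []
  const startedAt = new WeakMap<IncomingMessage, number>()

  // 1. Hook into Node.js HTTP handling
  // Uses diagnostics_channel (Node 16+), so there is no monkey-patching
  diagnostics_channel.subscribe('http.server.request.start', (message) => {
    startedAt.set((message as HttpEvent).request, performance.now())
  })

  // 2. For each finished response, capture timing and metadata
  diagnostics_channel.subscribe('http.server.response.finish', (message) => {
    const { request, response } = message as HttpEvent
    const start = startedAt.get(request)
    if (start === undefined) return
    buffer.push({
      path: request.url,
      method: request.method,
      status: response.statusCode,
      duration: performance.now() - start, // milliseconds
    })
  })

  // 3. Batch and send every 10 seconds (non-blocking)
  const timer = setInterval(() => {
    const batch = buffer.splice(0, buffer.length)
    if (batch.length > 0) {
      // Fire-and-forget: does not block your API responses
      fetch('https://api.nurbak.com/v1/ingest', {
        method: 'POST',
        body: JSON.stringify(batch),
        headers: { 'Authorization': `Bearer ${config.apiKey}` },
      }).catch(() => {}) // Silent failure: monitoring should never break your app
    }
  }, 10_000)
  timer.unref() // Do not keep the process alive just for monitoring
}
The critical design decisions: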
- No monkey-patching. APM agents rewrite your http, fetch, and database modules at runtime. This causes version conflicts, breaks TypeScript types, and makes debugging harder. diagnostics_channel is the Node.js-native way to observe without modifying.
- Async, batched sending. Metrics are buffered in memory and sent in batches. Individual API responses are never delayed by the monitoring system.
- Silent failure. If the monitoring endpoint is down, your app keeps running normally. Monitoring should observe, never interfere.
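The last two decisions can be sketched as a small buffer. This is an illustration under stated assumptions, not the SDK's actual internals: `MetricBuffer`, its size cap, and the drop-oldest policy are all hypothetical choices made for the example.

```typescript
type Metric = { path: string; status: number; durationMs: number }
type Sender = (batch: Metric[]) => Promise<unknown>

// Hypothetical bounded buffer: caps memory and never throws from flush().
export class MetricBuffer {
  private items: Metric[] = []
  private readonly maxSize: number
  private readonly send: Sender
  dropped = 0

  constructor(maxSize: number, send: Sender) {
    this.maxSize = maxSize
    this.send = send
  }

  get size(): number {
    return this.items.length
  }

  // When full, drop the oldest metric rather than growing without bound.
  record(metric: Metric): void {
    if (this.items.length >= this.maxSize) {
      this.items.shift()
      this.dropped++
    }
    this.items.push(metric)
  }

  // Sends the current batch. Resolves to the number of metrics delivered,
  // or 0 if the sender failed. It never rejects.
  async flush(): Promise<number> {
    if (this.items.length === 0) return 0
    const batch = this.items
    this.items = []
    try {
      await this.send(batch)
      return batch.length
    } catch {
      return 0
    }
  }
}
```

Capping the buffer matters for the silent-failure case: if the ingest endpoint is down for an hour, an unbounded array would slowly eat the function's memory, which is exactly the kind of interference monitoring should avoid.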
What You Get Without the Agent
With an agentless SDK like Nurbak Watch, every API route is tracked automatically:
- Latency percentiles — P50, P95, P99 for every endpoint. Real server-side timing, not synthetic pings.
- Error rates — 4xx and 5xx percentage per route, with automatic spike detection.
- Throughput — Requests per minute. See which endpoints are hot and which are idle.
- Cold start tracking — On Vercel, know exactly how often your functions cold-start and how much latency it adds.
- Instant alerts — Slack, email, or WhatsApp within 10 seconds of an incident. Not minutes — seconds.
No Grafana dashboards to build. No PromQL queries to learn. No time-series database to scale. You install it, deploy, and your monitoring is live.
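For readers curious what sits behind the first two bullets, here is a minimal sketch of how latency percentiles and error rates can be computed from raw samples. It uses the nearest-rank method, which is one common definition among several. The function names are hypothetical, and a production system would more likely use a streaming estimator than sort every sample.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
export function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples')
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100))
  return sorted[rank - 1]
}

// Fraction of responses with a 4xx or 5xx status code.
export function errorRate(statuses: number[]): number {
  if (statuses.length === 0) return 0
  return statuses.filter((status) => status >= 400).length / statuses.length
}
```

With ten samples of 10, 20, up to 100 milliseconds, the P50 under this definition is 50 and both P95 and P99 are 100, which shows why tail percentiles need a reasonable sample count before they mean much.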
When You Actually Need an Agent
To be fair, agents aren't always wrong. You might genuinely need a full APM agent if:
- You run on Kubernetes and need distributed tracing across 50+ microservices with automatic service maps.
- You need deep runtime profiling — CPU flame graphs, memory leak detection, garbage collection analysis.
- Compliance requires it — some SOC 2 or HIPAA implementations mandate specific APM tooling.
- You have a dedicated platform team that manages observability infrastructure full-time.
If none of these apply — if you're a team of 1-15 developers shipping a Next.js SaaS — you don't need an agent. You need visibility into your API routes with minimal friction.
Migration: Removing Your Agent in 10 Minutes
If you're currently running an APM agent with Next.js, here's how to switch:
Step 1: Remove the agent library
# Remove Datadog
npm uninstall dd-trace
# Or remove New Relic
npm uninstall newrelic
# Or remove Dynatrace
npm uninstall @dynatrace/oneagent
Step 2: Clean up environment variables
# Remove from .env.local and Vercel dashboard:
# DD_API_KEY, DD_SITE, DD_SERVICE, DD_ENV, DD_VERSION,
# DD_TRACE_ENABLED, DD_LOGS_INJECTION, DD_RUNTIME_METRICS_ENABLED,
# DD_PROFILING_ENABLED, DD_TRACE_SAMPLE_RATE
# ... (you get the idea)
Step 3: Install and configure Nurbak Watch
npm install @nurbak/watch
// instrumentation.ts
import { initWatch } from '@nurbak/watch'
export function register() {
  initWatch({
    apiKey: process.env.NURBAK_WATCH_KEY,
  })
}
Step 4: Add one environment variable
# .env.local (or Vercel dashboard)
NURBAK_WATCH_KEY=your_api_key_here
Step 5: Deploy
Your cold starts just got 200-800ms faster. Your function memory usage just dropped. And you still have full visibility into every API route.
Get Started — Free During Beta
Nurbak Watch is in beta and completely free during launch. No credit card. No agent. No daemon. No 47-option config file.
One npm install. Five lines of code. Every API route monitored.
Your ops team (which is probably also you) will thank you.

