Deploying MCP Servers to Production

Deploy MCP servers to production with Docker containers, cloud platforms, process management, health checks, logging, and monitoring using the official TypeScript SDK and mcp-framework.



Quick Summary

Moving an MCP server from development to production requires containerization, process management, health monitoring, and proper logging. This lesson covers Docker packaging, cloud deployment strategies (AWS, GCP, Railway, Fly.io), process managers like PM2, structured logging, health check endpoints, and graceful shutdown handling. You will learn patterns that work with both the official TypeScript SDK and mcp-framework.

Production Readiness Checklist

Before deploying an MCP server, verify these requirements:

1. Error handling is comprehensive: every tool, resource, and prompt handler catches errors and returns structured responses. No unhandled exception can crash the server.

2. Environment configuration is externalized: all secrets, API keys, and configuration values come from environment variables or a config service, never hardcoded.

3. Logging is structured and appropriate: use structured JSON logging, log to stderr for stdio servers, and include request IDs for tracing.

4. Health checks are implemented: HTTP transports expose a health endpoint; stdio servers handle health check messages.

5. Graceful shutdown is handled: the server cleans up connections, flushes logs, and closes database connections on SIGTERM/SIGINT.

6. Tests pass in CI: unit and integration tests run in your CI pipeline, with no skipped or flaky tests.
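Item 1 can be enforced mechanically by wrapping every handler once instead of repeating try/catch everywhere. A minimal sketch — the `ToolResult` shape and `withErrorHandling` helper below are simplified illustrations, not the SDK's exact types:

```typescript
// Simplified result shape for illustration (not the SDK's full type)
type ToolResult = {
  content: { type: "text"; text: string }[];
  isError?: boolean;
};

// Wrap a handler so that any thrown error becomes a structured,
// isError-flagged response instead of an unhandled exception.
export function withErrorHandling<Args>(
  handler: (args: Args) => Promise<ToolResult>
): (args: Args) => Promise<ToolResult> {
  return async (args: Args) => {
    try {
      return await handler(args);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return {
        content: [{ type: "text", text: `Tool failed: ${message}` }],
        isError: true,
      };
    }
  };
}
```

Registering every tool through a wrapper like this guarantees the "no unhandled exceptions" checklist item holds even for handlers written later.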

Docker Containerization

Dockerfile for MCP Servers

# Build stage
FROM node:20-alpine AS builder

WORKDIR /app

# Copy package files first for better layer caching
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts

# Copy source and build
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build

# Prune dev dependencies (--omit=dev replaces the deprecated --production flag)
RUN npm prune --omit=dev

# Production stage
FROM node:20-alpine AS production

# Security: run as non-root user
RUN addgroup -S mcp && adduser -S mcp -G mcp

WORKDIR /app

# Copy only production artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=sse
ENV PORT=3001

# Switch to non-root user
USER mcp

EXPOSE 3001

# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget -q --spider http://localhost:3001/health || exit 1

CMD ["node", "dist/index.js"]
Multi-Stage Docker Builds

Always use multi-stage builds for MCP servers. The build stage contains TypeScript, dev dependencies, and source files. The production stage contains only the compiled JavaScript and production dependencies. This reduces image size by 60-80% and minimizes the attack surface.

Docker Compose for Development

# docker-compose.yml
version: "3.8"

services:
  mcp-server:
    build: .
    ports:
      - "3001:3001"
    environment:
      - NODE_ENV=production
      - MCP_TRANSPORT=sse
      - PORT=3001
      - DATABASE_URL=postgresql://postgres:password@db:5432/mcpdata
      - MCP_API_KEY=${MCP_API_KEY}
    depends_on:
      db:
        condition: service_healthy
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: mcpdata
      POSTGRES_PASSWORD: password
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  pgdata:

.dockerignore

node_modules
dist
.git
.env
.env.*
*.md
tests
coverage
.github
A typical project layout:

my-mcp-server/
├── src/
│   ├── index.ts
│   ├── server.ts
│   ├── tools/
│   └── ...
├── Dockerfile
├── docker-compose.yml
├── .dockerignore
├── package.json
└── tsconfig.json

Health Checks

HTTP Health Endpoint

For SSE and Streamable HTTP servers, add a dedicated health endpoint:

import express from "express";

// `db` and `cache` are your application's own clients (e.g. a pg Pool
// and a Redis client), initialized elsewhere in your server setup.
const app = express();

// Health check endpoint
app.get("/health", async (req, res) => {
  const checks: Record<string, { status: string; latency?: number }> = {};

  // Check database
  try {
    const start = Date.now();
    await db.query("SELECT 1");
    checks.database = { status: "ok", latency: Date.now() - start };
  } catch {
    checks.database = { status: "error" };
  }

  // Check cache
  try {
    const start = Date.now();
    await cache.ping();
    checks.cache = { status: "ok", latency: Date.now() - start };
  } catch {
    checks.cache = { status: "error" };
  }

  const allHealthy = Object.values(checks).every(c => c.status === "ok");

  res.status(allHealthy ? 200 : 503).json({
    status: allHealthy ? "healthy" : "degraded",
    version: process.env.npm_package_version || "unknown",
    uptime: process.uptime(),
    checks,
    timestamp: new Date().toISOString(),
  });
});

// Liveness probe (minimal check — is the process alive?)
app.get("/healthz", (req, res) => {
  res.status(200).json({ status: "alive" });
});

// Readiness probe (is the server ready to accept traffic?)
app.get("/readyz", async (req, res) => {
  try {
    await db.query("SELECT 1");
    res.status(200).json({ status: "ready" });
  } catch {
    res.status(503).json({ status: "not ready" });
  }
});
Liveness vs Readiness Probes

A liveness probe checks if the process is alive and should be restarted if it fails. A readiness probe checks if the server can accept traffic — it may be alive but not ready (e.g., waiting for database connection). Kubernetes and other orchestrators use these differently.
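The distinction can be captured with a simple readiness flag that only flips once asynchronous initialization completes — a minimal sketch (the `markReady`/`readiness` helpers here are illustrative, not part of any SDK):

```typescript
// The process is "alive" (liveness) from the first instant, but only
// "ready" (readiness) once dependencies like the database are connected.
let ready = false;

export function markReady() {
  ready = true;
}

export function liveness(): { status: number; body: { status: string } } {
  return { status: 200, body: { status: "alive" } };
}

export function readiness(): { status: number; body: { status: string } } {
  return ready
    ? { status: 200, body: { status: "ready" } }
    : { status: 503, body: { status: "not ready" } };
}

// During startup, after async init succeeds:
// await db.connect(); markReady();
```

Wiring `/healthz` to `liveness()` and `/readyz` to `readiness()` lets an orchestrator hold traffic back during startup without restarting a healthy process.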

Structured Logging

Implementing a Logger

// src/logger.ts
type LogLevel = "debug" | "info" | "warn" | "error";

const LOG_LEVELS: Record<LogLevel, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
};

const currentLevel = LOG_LEVELS[
  (process.env.LOG_LEVEL as LogLevel) || "info"
];

export function log(
  level: LogLevel,
  message: string,
  data?: Record<string, unknown>
) {
  if (LOG_LEVELS[level] < currentLevel) return;

  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...data,
    pid: process.pid,
    service: "mcp-server",
  };

  // Always log to stderr — stdout is reserved for MCP protocol
  console.error(JSON.stringify(entry));
}

export const logger = {
  debug: (msg: string, data?: Record<string, unknown>) => log("debug", msg, data),
  info: (msg: string, data?: Record<string, unknown>) => log("info", msg, data),
  warn: (msg: string, data?: Record<string, unknown>) => log("warn", msg, data),
  error: (msg: string, data?: Record<string, unknown>) => log("error", msg, data),
};

Logging in Tool Handlers

import { z } from "zod";
import { logger } from "./logger.js";

server.tool(
  "process-data",
  "Process a data batch",
  { batchId: z.string() },
  async ({ batchId }) => {
    const requestId = crypto.randomUUID();

    logger.info("Tool invoked", {
      tool: "process-data",
      requestId,
      batchId,
    });

    const startTime = Date.now();

    try {
      const result = await processBatch(batchId);

      logger.info("Tool completed", {
        tool: "process-data",
        requestId,
        batchId,
        durationMs: Date.now() - startTime,
        resultCount: result.length,
      });

      return {
        content: [{ type: "text", text: JSON.stringify(result) }],
      };
    } catch (error) {
      logger.error("Tool failed", {
        tool: "process-data",
        requestId,
        batchId,
        durationMs: Date.now() - startTime,
        error: error instanceof Error ? error.message : String(error),
        stack: error instanceof Error ? error.stack : undefined,
      });

      return {
        content: [{ type: "text", text: `Processing failed: ${error}` }],
        isError: true,
      };
    }
  }
);
Always Use stderr for Logging

In stdio MCP servers, stdout is the protocol channel. Any non-JSON-RPC data on stdout will break the client connection. Use console.error() or write to stderr explicitly. This rule applies even when using logging libraries — configure them to output to stderr.
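The "include request IDs for tracing" requirement is easiest to satisfy with a request-scoped child logger that pre-binds correlation fields. A minimal sketch in the same spirit as the logger above — `createChildLogger` is a hypothetical helper, not an SDK API:

```typescript
type Fields = Record<string, unknown>;

// Create a logger with fields (e.g. requestId) bound once, so every
// line emitted during one request carries the same correlation data.
export function createChildLogger(base: Fields) {
  const write = (level: string, message: string, data?: Fields) => {
    // stderr only — stdout carries the MCP protocol in stdio servers
    console.error(
      JSON.stringify({
        timestamp: new Date().toISOString(),
        level,
        message,
        ...base,
        ...data,
      })
    );
  };
  return {
    info: (msg: string, data?: Fields) => write("info", msg, data),
    warn: (msg: string, data?: Fields) => write("warn", msg, data),
    error: (msg: string, data?: Fields) => write("error", msg, data),
  };
}

// Usage inside a handler:
// const log = createChildLogger({ requestId: crypto.randomUUID() });
// log.info("Tool invoked", { tool: "process-data" });
```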

Graceful Shutdown

// src/shutdown.ts
import { logger } from "./logger.js";

type CleanupFn = () => Promise<void>;

const cleanupFns: CleanupFn[] = [];

export function onShutdown(fn: CleanupFn) {
  cleanupFns.push(fn);
}

async function shutdown(signal: string) {
  logger.info("Shutdown initiated", { signal });

  const timeout = setTimeout(() => {
    logger.error("Forced shutdown after timeout");
    process.exit(1);
  }, 10000); // 10s grace period

  for (const fn of cleanupFns) {
    try {
      await fn();
    } catch (error) {
      logger.error("Cleanup error", {
        error: error instanceof Error ? error.message : String(error),
      });
    }
  }

  clearTimeout(timeout);
  logger.info("Shutdown complete");
  process.exit(0);
}

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));

// Usage in server setup
import { onShutdown } from "./shutdown.js";

onShutdown(async () => {
  await server.close();
  logger.info("MCP server closed");
});

onShutdown(async () => {
  await db.end();
  logger.info("Database connections closed");
});

onShutdown(async () => {
  await cache.quit();
  logger.info("Cache connection closed");
});
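For HTTP-based transports, the server itself should also be drained before exit. A small sketch using Node's built-in `http` module — `closeServer` is a hypothetical helper that promisifies `server.close()`, which stops accepting new connections and resolves once in-flight requests finish:

```typescript
import http from "node:http";

// Promisified graceful close: stop accepting new connections,
// resolve once existing requests have completed.
export function closeServer(server: http.Server): Promise<void> {
  return new Promise((resolve, reject) => {
    server.close(err => (err ? reject(err) : resolve()));
  });
}

// Wiring into the shutdown hooks above:
// onShutdown(() => closeServer(httpServer));
```

Registering this before the database cleanup means requests stop arriving before their backing connections disappear.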

Cloud Deployment

Deploying to Railway

// railway.json
{
  "build": {
    "builder": "DOCKERFILE",
    "dockerfilePath": "Dockerfile"
  },
  "deploy": {
    "startCommand": "node dist/index.js",
    "healthcheckPath": "/health",
    "healthcheckTimeout": 10,
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 5
  }
}

Deploying to Fly.io

# fly.toml
app = "my-mcp-server"
primary_region = "iad"

[build]
  dockerfile = "Dockerfile"

[env]
  MCP_TRANSPORT = "sse"
  NODE_ENV = "production"
  LOG_LEVEL = "info"

[http_service]
  internal_port = 3001
  force_https = true

  [[http_service.checks]]
    grace_period = "10s"
    interval = "30s"
    method = "GET"
    path = "/health"
    timeout = "5s"

[[vm]]
  size = "shared-cpu-1x"
  memory = "256mb"

AWS ECS Task Definition

{
  "family": "mcp-server",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "mcp-server",
      "image": "your-account.dkr.ecr.region.amazonaws.com/mcp-server:latest",
      "essential": true,
      "portMappings": [
        { "containerPort": 3001, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "MCP_TRANSPORT", "value": "sse" },
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        {
          "name": "MCP_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:mcp-api-key"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "wget -q --spider http://localhost:3001/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 10
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/mcp-server",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "mcp"
        }
      }
    }
  ]
}
| Platform | Best For | MCP Transport | Estimated Cost |
|----------|----------|---------------|----------------|
| Railway | Quick deployments, small teams | SSE, Streamable HTTP | $5-20/mo |
| Fly.io | Edge deployments, global distribution | SSE, Streamable HTTP | $5-30/mo |
| AWS ECS/Fargate | Enterprise, existing AWS infra | SSE, Streamable HTTP | $10-50/mo |
| Google Cloud Run | Auto-scaling, pay-per-request | Streamable HTTP | $0-20/mo |
| Self-hosted Docker | Full control, on-premise | Any transport | Hardware costs |

Process Management with PM2

For servers running on VMs or bare metal:

// ecosystem.config.cjs
module.exports = {
  apps: [{
    name: "mcp-server",
    script: "dist/index.js",
    instances: 1,           // MCP servers are typically single-instance
    exec_mode: "fork",
    autorestart: true,
    watch: false,
    max_memory_restart: "500M",
    env_production: {
      NODE_ENV: "production",
      MCP_TRANSPORT: "sse",
      PORT: 3001,
      LOG_LEVEL: "info",
    },
    error_file: "/var/log/mcp-server/error.log",
    out_file: "/var/log/mcp-server/out.log",
    merge_logs: true,
    log_date_format: "YYYY-MM-DD HH:mm:ss Z",
  }],
};
# Start in production
pm2 start ecosystem.config.cjs --env production

# Monitor
pm2 monit

# View logs
pm2 logs mcp-server

# Restart with zero downtime
pm2 reload mcp-server

mcp-framework Production Configuration

import { MCPServer } from "mcp-framework";

const server = new MCPServer({
  name: "production-server",
  version: process.env.npm_package_version || "1.0.0",
  transport: {
    type: "sse",
    options: {
      port: parseInt(process.env.PORT || "3001"),
    },
  },
});

// mcp-framework handles tool/resource/prompt discovery automatically
await server.start();
Use npm Package Version

Set your server version from package.json so the version reported to MCP clients matches your actual deployed artifact, making debugging easier. Note that npm only injects process.env.npm_package_version when the process is launched through an npm script; when running node dist/index.js directly (as the Dockerfile above does), that variable is undefined, so read the version from package.json yourself and keep a fallback.
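One way to resolve the version reliably, sketched under the assumption that package.json sits at the app root — `resolveVersion` is a hypothetical helper, not an SDK function:

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Prefer npm's injected variable (present when launched via an npm
// script); otherwise read package.json directly, as needed when the
// container runs `node dist/index.js`.
export function resolveVersion(appRoot: string = process.cwd()): string {
  if (process.env.npm_package_version) {
    return process.env.npm_package_version;
  }
  try {
    const pkg = JSON.parse(
      readFileSync(join(appRoot, "package.json"), "utf8")
    );
    return typeof pkg.version === "string" ? pkg.version : "unknown";
  } catch {
    return "unknown";
  }
}

// Usage: new MCPServer({ name: "production-server", version: resolveVersion(), ... })
```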

Deployment Architecture

For production setups with multiple MCP servers:

┌──────────────┐     ┌───────────────┐     ┌─────────────────┐
│  AI Client   │────>│ Load Balancer │────>│  MCP Server (1) │
│  (Claude,    │     │  (nginx/ALB)  │────>│  MCP Server (2) │
│   Cursor)    │     │               │────>│  MCP Server (3) │
└──────────────┘     └───────────────┘     └─────────────────┘
                             │
                      ┌──────┴───────┐
                      │ Health Check │
                      │   Endpoint   │
                      └──────────────┘
SSE and Load Balancing

SSE connections are long-lived. When load balancing SSE-based MCP servers, use sticky sessions (session affinity) to ensure all requests from a session hit the same server instance. Streamable HTTP does not have this limitation.

Frequently Asked Questions