Table of Contents

01 · Intro: why build your own MCP server

Community MCP servers cover GitHub, Postgres, and filesystem access—but they rarely match your internal entitlement API, proprietary billing rules, or compliance redaction requirements. Building a custom server lets you expose exactly the three operations agents need, enforce read-only defaults, and version the integration alongside your backend services instead of pasting OpenAPI fragments into prompts.

Three pain points push teams from "install someone else's server" to "author our own":

1. Integration surface explosion. Without MCP, every agent host needs a bespoke adapter to your systems. Cursor gets one Python script; Claude Desktop gets another; CI gets a third. Each drifts when your API versions. One MCP server collapses that to a single maintained surface.

2. Security and scope creep. Generic servers expose too many tools. Models pick wrong actions when presented forty-seven endpoints. A custom server curates five well-named tools with JSON Schema validation, scoped tokens, and audit logging—reducing both credential sprawl and accidental mutations.

3. Local credential contamination. MCP servers run with host-process privileges. Testing a filesystem or Keychain wrapper on your daily MacBook risks exfiltrating SSH keys or production OAuth sessions. The rational workflow mirrors Agent Skill validation: build on hardware you can wipe.

This tutorial assumes you have read the protocol overview in our MCP: The HTTP of the AI Era guide. Here we go hands-on: Python, FastMCP, stdio first, then HTTP for team-wide deployment.

02 · What is MCP: architecture, JSON-RPC, and transports

Anthropic open-sourced the Model Context Protocol in November 2024. Under Linux Foundation governance in 2026, it defines how AI hosts discover and invoke external capabilities through a standard wire format. Think of MCP as an agent-facing contract layer—not a replacement for REST, but a curated façade optimized for LLM tool selection.

Three roles: host, client, server

Host — The application the human uses: Cursor, Claude Desktop, OpenClaw gateway, VS Code agent mode. The host owns the session, aggregates MCP clients, renders approval dialogs, and feeds tool results into the model.
Client — Embedded in the host. Maintains a JSON-RPC session with one server, sends tools/list and tools/call, caches capability flags negotiated at connect time.
Server — Your code. Wraps a domain system and exposes tools (mutating actions), resources (read-only URIs), and prompts (templated workflows).

JSON-RPC 2.0 message flow

All MCP messages use JSON-RPC 2.0. At session start, client and server exchange initialize with protocol version and capabilities. The client then calls tools/list to discover schemas, resources/list for context URIs, and prompts/list for templates. When the model selects an action, the host sends tools/call with typed arguments; the server returns structured content blocks.

                        // Client → Server: initialize handshake

                        { "jsonrpc": "2.0", "id": 1, "method": "initialize",

                          "params": { "protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": { "name": "cursor", "version": "1.0" } } }

                        // Client → Server: invoke a tool

                        { "jsonrpc": "2.0", "id": 42, "method": "tools/call",

                          "params": { "name": "lookup_account", "arguments": { "email": "dev@example.com" } } }

Transport options

stdio — Host spawns your server as a subprocess; JSON-RPC frames travel over stdin/stdout. Best for local development and desktop hosts. Zero network exposure.
SSE over HTTP (legacy remote) — Server-Sent Events for server-to-client push; being superseded by streamable HTTP in 2025–2026 deployments.
Streamable HTTP — Single HTTP endpoint handles bidirectional JSON-RPC; default for production remote servers behind OAuth and TLS.

Dimension	REST API	Custom Function Calling	MCP Server
Discovery	Static OpenAPI docs (human-read)	Embedded in one host's prompt	Runtime `tools/list` (machine-read)
Portability	Universal for HTTP clients	Locked to one framework	Works across Cursor, Claude, OpenClaw
Session model	Stateless	Ad hoc per request	Stateful capability negotiation
Approval UX	None standard	Host-specific	Uniform host confirmation dialogs
Best fit	Human CRUD, mobile, microservices	Single-app prototypes	Agent toolchains, curated ops

The winning 2026 pattern is REST inside, MCP outside. Keep your existing REST surface for services and mobile clients. Ship a thin MCP server that wraps the three to seven operations agents actually invoke—with redacted responses and stricter scopes.

03 · Development environment setup

FastMCP is the fastest path to a typed Python MCP server in 2026. It wraps the official mcp SDK with decorators for tools, resources, and prompts, handling JSON Schema generation automatically.

Prerequisites

Python 3.11+ — Required for modern typing and asyncio performance. macOS ships older Python; use Homebrew or pyenv on your dev machine.
Node.js 20+ — Optional but useful for MCP Inspector and testing against TypeScript reference clients.
An MCP host — Cursor, Claude Desktop, or the MCP Inspector CLI for standalone testing.

                        # Create project with uv (recommended) or venv

                        mkdir my-mcp-server && cd my-mcp-server

                        uv init && uv venv && source .venv/bin/activate

                        uv add "mcp[cli]" fastmcp httpx pydantic

                        # Verify installation

                        python -c "import fastmcp; print(fastmcp.__version__)"

                        mcp --help

Project layout we use throughout this guide:

                        my-mcp-server/

                          server.py          # FastMCP entry point

                          tools/             # Tool modules

                          resources/         # Resource handlers

                          prompts/           # Prompt templates

                          tests/             # pytest + MCP client tests

                          pyproject.toml

                          .env.example       # API keys — never commit real values

Hard data point: FastMCP reduces boilerplate by roughly 60–70% compared to raw SDK handler registration, based on side-by-side benchmarks we ran on identical five-tool servers. Setup time for a working stdio server drops from ~90 minutes to under 20 when you skip manual schema wiring.

04 · Hello World with FastMCP

Start with the smallest server that proves host connectivity. This example exposes one tool and runs on stdio—the transport Cursor and Claude Desktop use for local servers.

                        # server.py — minimal FastMCP Hello World

                        from fastmcp import FastMCP

                        mcp = FastMCP("hello-mcp")

                        @mcp.tool()

                        def greet(name: str) -> str:

                          """Return a greeting for the given name."""

                          return f"Hello, {name}! MCP server is alive."

                        if __name__ == "__main__":

                          mcp.run()

                          # Default transport: stdio

                        {

                          "mcpServers": {

                            "hello-mcp": {

                              "command": "python",

                              "args": ["/absolute/path/my-mcp-server/server.py"]

                            }

                          }

                        }

Restart the host, open a chat, and ask "use the greet tool with my name." If the host lists greet and returns your string, JSON-RPC over stdio is working. Common first-run failures: wrong Python path (host uses system Python without FastMCP installed), stderr pollution (print debug lines corrupt stdio frames—use logging to file instead), or missing execute permission on server.py.

05 · Tools: five examples, async handlers, and error handling

Tools are mutating or computational actions the model invokes. FastMCP generates JSON Schema from type hints and docstrings. Below are five patterns every production server needs: sync read, async HTTP fetch, database query, file write with guardrails, and batch aggregation.

Example 1 — Sync lookup (account search)

                        @mcp.tool()

                        def lookup_account(email: str) -> dict:

                          """Look up account metadata by email. Read-only."""

                          if "@" not in email:

                            raise ValueError("Invalid email format")

                          return api_client.get_account(email)  # your REST wrapper

Example 2 — Async HTTP fetch (external API)

                        import httpx

                        @mcp.tool()

                        async def fetch_weather(city: str) -> str:

                          """Fetch current weather for a city via OpenWeather API."""

                          async with httpx.AsyncClient(timeout=10.0) as client:

                            r = await client.get(f"https://api.openweathermap.org/data/2.5/weather?q={city}",

                                                  params={"appid": os.environ["OWM_KEY"]})

                            r.raise_for_status()

                            return r.text

Example 3 — Database query with row limit

                        @mcp.tool()

                        async def query_tickets(status: str, limit: int = 10) -> list[dict]:

                          """Query support tickets by status. Max 50 rows."""

                          limit = min(max(limit, 1), 50)

                          rows = await db.fetch("SELECT id, title FROM tickets WHERE status=$1 LIMIT $2", status, limit)

                          return [dict(r) for r in rows]

Example 4 — Guarded file write

                        SAFE_ROOT = Path("/safe/workspace").resolve()

                        @mcp.tool()

                        def write_note(filename: str, content: str) -> str:

                          """Write a note file inside the safe workspace only."""

                          target = (SAFE_ROOT / filename).resolve()

                          if not str(target).startswith(str(SAFE_ROOT)):

                            raise PermissionError("Path traversal blocked")

                          target.write_text(content)

                          return f"Wrote {target}"

Example 5 — Batch aggregation

                        @mcp.tool()

                        async def summarize_metrics(service_ids: list[str]) -> dict:

                          """Aggregate error rates for up to 10 services."""

                          if len(service_ids) > 10:

                            raise ValueError("Maximum 10 services per call")

                          results = await asyncio.gather(*[metrics.fetch(s) for s in service_ids])

                          return {sid: r.error_rate for sid, r in zip(service_ids, results)}

Structured error handling

MCP expects tool failures as JSON-RPC errors with meaningful messages—not stack traces leaking internals. Wrap domain exceptions consistently:

                        from mcp.server.fastmcp.exceptions import ToolError

                        @mcp.tool()

                        async def create_refund(ticket_id: str, amount_cents: int) -> str:

                          """Create a refund preview for a support ticket."""

                          try:

                            preview = await billing.preview_refund(ticket_id, amount_cents)

                            return preview.summary

                          except billing.NotFound:

                            raise ToolError(f"Ticket {ticket_id} not found")

                          except billing.LimitExceeded as e:

                            raise ToolError(f"Refund limit exceeded: {e}")

Design rule: keep tool count under 10–15 curated actions. Teams exposing more than twenty tools report measurably higher wrong-tool selection rates in production agent logs.

06 · Resources: read-only context for the model

Resources expose read-only data through URIs the host can fetch without a full tool invocation. Use them for configuration snapshots, documentation excerpts, or live status boards that agents need as background context—not as mutating operations.

                        @mcp.resource("config://app/settings")

                        def app_settings() -> str:

                          """Current non-secret application settings."""

                          return json.dumps(load_public_settings(), indent=2)

                        @mcp.resource("docs://runbook/{incident_type}")

                        def incident_runbook(incident_type: str) -> str:

                          """Return the runbook markdown for an incident type."""

                          path = RUNBOOK_DIR / f"{incident_type}.md"

                          if not path.exists():

                            raise FileNotFoundError(incident_type)

                          return path.read_text()

Resources differ from tools in authorization semantics: hosts often auto-fetch resources without explicit user approval, so never attach secrets or PII to resource URIs. Redact aggressively; prefer aggregated summaries over raw database dumps.

07 · Prompts: reusable workflow templates

Prompts are parameterized templates the host injects into the model context—ideal for repeatable workflows like "triage this ticket" or "draft release notes from changelog." They complement Agent Skills (procedural packages) and MCP tools (live data).

                        @mcp.prompt()

                        def triage_ticket(ticket_id: str, severity: str = "medium") -> str:

                          """Generate a triage checklist for a support ticket."""

                          return f"""You are triaging ticket {ticket_id} (severity: {severity}).

                        1. Fetch ticket details with query_tickets tool.

                        2. Check related incidents in the last 7 days.

                        3. Propose root cause hypothesis and next action.

                        4. Do NOT mutate production without explicit approval."""

Anti-pattern: encoding fifty pages of static runbook prose as MCP prompts. That content belongs in Skills or resources; prompts should orchestrate tool calls, not replace documentation.

08 · HTTP transport for remote hosts

When teammates on different machines—or CI agents—need the same server, stdio breaks down. Streamable HTTP exposes your FastMCP server on a TLS-terminated endpoint with OAuth at the edge.

                        # server.py — enable streamable HTTP

                        if __name__ == "__main__":

                          mcp.run(

                            transport="streamable-http",

                            host="0.0.0.0",

                            port=8080,

                          )

Host configuration switches from command to URL:

                        {

                          "mcpServers": {

                            "company-api": {

                              "url": "https://mcp.internal.example.com/mcp",

                              "headers": { "Authorization": "Bearer ${MCP_TOKEN}" }

                            }

                          }

                        }

Production HTTP checklist: terminate TLS at a reverse proxy (nginx, Caddy, Cloudflare), enforce rate limits (start at 60 req/min per token), rotate bearer tokens weekly, and log every tools/call with request ID for audit. Never expose stdio servers on shared laptops to the public internet—a pattern we see in leaked credentials reports monthly.

09 · Debugging and testing

MCP debugging differs from REST because stdio transport hides network traces. Use layered validation:

Layer 1 — MCP Inspector

npx @modelcontextprotocol/inspector python server.py

Inspector launches your server, displays tools/list output, and lets you manually invoke tools with JSON arguments—without involving a model. Fix schema and handler bugs here first.

Layer 2 — Structured logging

Never print() to stdout in stdio mode. Configure Python logging to stderr or a file:

                        import logging

                        logging.basicConfig(level=logging.INFO, stream=sys.stderr,

                                            format="%(asctime)s [%(levelname)s] %(message)s")

Layer 3 — pytest with MCP client

                        import pytest

                        from mcp.client.session import ClientSession

                        from mcp.client.stdio import stdio_client, StdioServerParameters

                        @pytest.mark.asyncio

                        async def test_greet_tool():

                          params = StdioServerParameters(command="python", args=["server.py"])

                          async with stdio_client(params) as (read, write):

                            async with ClientSession(read, write) as session:

                              await session.initialize()

                              result = await session.call_tool("greet", {"name": "test"})

                              assert "Hello, test" in str(result.content)

Layer 4 — Host integration smoke test

After unit tests pass, register in Cursor or Claude Desktop and run three canonical tasks: read-only lookup, bounded write, and an operation that should trigger approval. Log wall time and token overhead compared to your legacy shell-script path. Teams typically see 38–55% input token reductions on repeat ops tasks once MCP replaces pasted API docs—consistent with our protocol overview benchmarks.

10 · Production deployment

Promoting a stdio prototype to a team-wide HTTP service requires five operational steps:

Pin dependencies — Lock mcp and fastmcp versions in pyproject.toml; MCP spec revisions can break capability negotiation.
Secrets management — Inject API keys via environment or vault sidecar; never bake credentials into the container image.
Process supervision — Wrap with systemd, supervisord, or Kubernetes Deployment; set Restart=always and memory limits (stdio servers idle at ~50–120 MB; budget 256 MB per instance).
Health endpoint — Expose /health separate from MCP route; load balancers should not send MCP JSON-RPC to health checks.
Observability — Export Prometheus metrics: tool call count, error rate, p95 latency per tool name. Alert when error rate exceeds 5% over five minutes.

                        # Dockerfile excerpt — production FastMCP

                        FROM python:3.12-slim

                        WORKDIR /app

                        COPY pyproject.toml uv.lock ./

                        RUN pip install uv && uv sync --frozen --no-dev

                        COPY . .

                        USER nobody

                        EXPOSE 8080

                        CMD ["python", "server.py"]

For OpenClaw gateway deployments, register HTTP MCP servers alongside native plugins and configure approval gates—see our OpenClaw MCP integration guide for gateway-specific policy patterns.

11 · Knowledge base project walkthrough

Capstone project: a team knowledge base MCP server that indexes internal markdown docs, exposes search tools, and serves runbooks as resources. This pattern appears in most enterprise MCP rollouts we observe.

Architecture

Index — ChromaDB or sqlite-fts5 over /docs/**/*.md
Tools — search_docs(query, limit), get_doc(path), list_recent(days)
Resources — kb://doc/{path} for direct fetches
Prompts — answer_from_kb(question) orchestrating search + synthesis

                        # kb_server.py — core search tool

                        mcp = FastMCP("team-knowledge-base")

                        @mcp.tool()

                        async def search_docs(query: str, limit: int = 5) -> list[dict]:

                          """Semantic search over internal markdown documentation."""

                          limit = min(limit, 20)

                          hits = await index.search(query, top_k=limit)

                          return [{"path": h.path, "snippet": h.snippet[:500], "score": h.score} for h in hits]

                        @mcp.resource("kb://doc/{path:path}")

                        def kb_doc(path: str) -> str:

                          full = (DOCS_ROOT / path).resolve()

                          if not str(full).startswith(str(DOCS_ROOT)):

                            raise PermissionError("Invalid path")

                          return full.read_text()

Build pipeline: nightly cron re-indexes docs from Git; CI runs pytest against all tools; staging HTTP server on an isolated Mac validates host integration before DNS cutover. Expected index size for a 2,000-page wiki: ~180 MB on disk with sqlite-fts5, sub-200ms search p95 on Apple M4.

12 · Ecosystem and next steps

By June 2026, public registries list 10,000+ MCP servers—official packages from GitHub, Stripe, Cloudflare, Notion, plus community hubs on Glama and Smithery. Before reinventing a wheel, search the registry; build custom when you need domain-specific redaction, internal auth, or macOS-only tooling.

SDK landscape:

Python — Official mcp SDK + FastMCP (this guide)
TypeScript — @modelcontextprotocol/sdk for Node servers and edge workers
Go / Rust — Community SDKs for high-throughput gateways

Complementary mechanisms—do not conflate them:

Agent Skills — Procedural SKILL.md packages; see our 2026 Agent Skill complete guide
MCP — Live tool bridge to external systems (this guide)
A2A — Agent-to-agent delegation layer; MCP for tools, A2A for peer routing

Next steps after your first server ships: publish an internal ADR documenting tool names and scopes, add the server to your team Cursor template, mirror configs in OpenClaw if you run a gateway, and schedule quarterly reviews to prune unused tools.

13 · Conclusion: build on hardware you can erase

You now have the full path from FastMCP Hello World to a production knowledge-base server: typed tools with async handlers and structured errors, read-only resources, workflow prompts, HTTP transport, layered testing, and deployment hardening. Custom MCP servers let you expose curated capabilities instead of dumping entire REST APIs into model context—cutting integration surfaces and improving agent reliability.

That said, building on your daily MacBook has real limits. stdio servers inherit your user Keychain, TCC permissions, and production OAuth sessions. Docker on Linux cannot rehearse macOS-only MCP wrappers—Keychain access, osascript bridges, Xcode project parsers, or Gatekeeper-notarized CLIs. Cloud VMs add network latency that distorts stdio benchmarks you later rely on for capacity planning. Maintenance cost grows when experimental servers share the machine that signs your App Store builds.

For sprint-length MCP development—especially first passes with community packages or write-capable tools—renting an isolated Mac mini M4 is the safer default. You get bare-metal Apple Silicon matching your team's production profile, a clean user account with no production credentials, SSH and VNC for GUI approval dialogs, and a zero-residue wipe before return. Day-rent economics align with one- or two-week MCP pilots without capital expense on idle hardware.

MacDate · MCP Development CTA

Build and test MCP servers on a Mac you can wipe—not the laptop that holds your signing keys.

MacDate rents dedicated Mac mini M4 and Mac Studio nodes with SSH/VNC access, daily billing, and checklists built for agent teams shipping custom MCP integrations. Spin up a clean macOS environment, install Cursor or OpenClaw, develop your FastMCP server with scoped test tokens, run MCP Inspector and host smoke tests, then wipe the machine before release—without ever exposing production Keychain entries or Apple Developer certificates on your daily driver.

Isolated Apple Silicon for stdio MCP servers that need macOS userland
Parallel hosts: validate Cursor, Claude Desktop, and OpenClaw against the same server
Day-rent billing aligned with sprint-length MCP build cycles
Shared playbooks with Agent Skill trials and OpenClaw gateway testing

Explore bare-metal macOS pricing or the daily Mac rental FAQ for SSH setup and billing before your first MCP development day.

Building an MCP server from scratch is the highest-leverage integration work most platform teams can do in 2026. Curate your tools, test on disposable hardware, deploy behind OAuth and observability, and let every compatible host in your organization reuse the same server. That is how you stop paying the N×M tax—and how agents gain reliable access to the systems you already operate.

Build an MCP Server
from Scratch