After building and contributing to several MCP servers, I've noticed patterns that separate well-designed servers from fragile ones. Here's what I've learned about building MCP servers that are maintainable, testable, and reliable.
What Makes a Good MCP Server
A good MCP server does one thing well. It's the Unix philosophy applied to AI tooling: do one job, do it reliably, compose with others.
The best servers I've worked with share these traits:
- Clear scope — A filesystem server handles files. A database server handles queries. No kitchen sinks.
- Predictable behavior — Same inputs produce same outputs (or explicit reasons why not)
- Graceful degradation — Failures are reported clearly, not swallowed silently
- Fast startup — AI assistants connect and disconnect frequently
The worst servers try to do everything. They expose 50 tools, require complex configuration, and break in mysterious ways.
Stateless vs Stateful Patterns
Most MCP servers should be stateless. Each tool call should be independent — no hidden state that affects behavior.
Stateless (Preferred)
@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "read_file":
        path = arguments["path"]
        content = Path(path).read_text()
        return [TextContent(type="text", text=content)]

Every call stands alone. The server doesn't remember previous calls or maintain context between requests. This makes servers:
- Easy to test (no setup/teardown)
- Safe to restart (no state lost)
- Simple to reason about (no hidden dependencies)
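
To make the testing benefit concrete, here's a minimal sketch. It assumes the handler body is pulled out into a plain function; `read_file_tool` and `demo` are hypothetical names, and the function returns a string instead of `TextContent` so the example has no dependencies.

```python
import tempfile
from pathlib import Path


def read_file_tool(arguments: dict) -> str:
    """Stateless handler body: the result depends only on the arguments and the file system."""
    return Path(arguments["path"]).read_text(encoding="utf-8")


def demo() -> bool:
    # No fixtures or server startup needed: create a file, then call the handler twice.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "note.txt"
        target.write_text("hello", encoding="utf-8")
        first = read_file_tool({"path": str(target)})
        second = read_file_tool({"path": str(target)})
    # Repeated calls agree because nothing is remembered between them.
    return first == second == "hello"
```

Because nothing carries over between calls, the test needs no setup beyond the file it reads.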
Stateful (When Necessary)
Some servers genuinely need state. Database connections, authenticated sessions, or long-running operations require persistence:
class DatabaseServer:
    def __init__(self):
        self.connections: dict[str, Connection] = {}

    async def call_tool(self, name: str, arguments: dict):
        if name == "connect":
            conn_id = str(uuid4())
            self.connections[conn_id] = await create_connection(arguments["dsn"])
            return [TextContent(type="text", text=f"Connected: {conn_id}")]
        if name == "query":
            conn = self.connections.get(arguments["connection_id"])
            if not conn:
                raise ValueError("Connection not found: call 'connect' first")
            # ...

If you must be stateful:
- Make state explicit — Return handles/IDs that callers must pass back
- Handle cleanup — Implement timeouts, limits, or explicit close operations
- Document dependencies — If tool B requires tool A first, say so clearly
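
One way to cover the first two points is a small registry that hands out IDs and enforces a limit and an idle timeout. This is a sketch under my own assumptions, not part of any MCP SDK: `HandleRegistry`, its parameters, and the default values are all hypothetical, and a real server would also close the underlying connection when an entry is evicted.

```python
import time
from uuid import uuid4


class HandleRegistry:
    """Tracks stateful resources behind explicit IDs, with a size limit and an idle timeout."""

    def __init__(self, max_handles: int = 10, idle_timeout: float = 300.0, clock=time.monotonic):
        self._entries: dict[str, tuple[object, float]] = {}
        self._max = max_handles
        self._timeout = idle_timeout
        self._clock = clock  # injectable so tests can control time

    def _evict_expired(self) -> None:
        # Drop entries that have been idle longer than the timeout.
        now = self._clock()
        expired = [hid for hid, (_, used) in self._entries.items() if now - used > self._timeout]
        for hid in expired:
            del self._entries[hid]

    def add(self, resource: object) -> str:
        self._evict_expired()
        if len(self._entries) >= self._max:
            raise RuntimeError(f"Too many open handles (limit {self._max}); close one first")
        handle_id = str(uuid4())
        self._entries[handle_id] = (resource, self._clock())
        return handle_id

    def get(self, handle_id: str) -> object:
        self._evict_expired()
        if handle_id not in self._entries:
            raise KeyError(f"Unknown or expired handle: {handle_id}")
        resource, _ = self._entries[handle_id]
        self._entries[handle_id] = (resource, self._clock())  # refresh the idle timer
        return resource

    def close(self, handle_id: str) -> bool:
        # Explicit close; returns False if the handle was already gone.
        return self._entries.pop(handle_id, None) is not None
```

The caller always passes the ID back, so the dependency between "connect" and "query" stays visible in the tool arguments rather than hidden inside the server.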
Tool Design Best Practices
One Tool, One Job
Split complex operations into composable tools:
# Bad: One mega-tool
Tool(name="manage_user", description="Create, update, delete, or query users")

# Good: Focused tools
Tool(name="create_user", description="Create a new user account")
Tool(name="get_user", description="Get user details by ID")
Tool(name="update_user", description="Update user fields")
Tool(name="delete_user", description="Delete a user account")

AIs handle multiple small tools better than one complex tool with mode switches.
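
Focused tools also keep the server side simple, because each tool name can map to one small handler. Here's a rough sketch of that idea. Everything in it is illustrative: `USERS` is a hypothetical in-memory stand-in for a real user store, the handlers return plain strings instead of `TextContent`, and `dispatch` is my own name, not an SDK function.

```python
import asyncio
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[str]]

# Hypothetical in-memory store standing in for a real user database.
USERS: dict[str, dict] = {}


async def create_user(arguments: dict) -> str:
    USERS[arguments["id"]] = {"name": arguments["name"]}
    return f"Created user {arguments['id']}"


async def get_user(arguments: dict) -> str:
    user = USERS.get(arguments["id"])
    if user is None:
        return f"Error: No user with ID {arguments['id']}"
    return f"User {arguments['id']}: {user['name']}"


# One entry per tool: no mode switches inside any handler.
HANDLERS: dict[str, Handler] = {"create_user": create_user, "get_user": get_user}


async def dispatch(name: str, arguments: dict) -> str:
    handler = HANDLERS.get(name)
    if handler is None:
        known = ", ".join(sorted(HANDLERS))
        return f"Error: Unknown tool '{name}'. Available tools: {known}"
    return await handler(arguments)


if __name__ == "__main__":
    print(asyncio.run(dispatch("create_user", {"id": "1", "name": "Ada"})))
    print(asyncio.run(dispatch("get_user", {"id": "1"})))
```

Adding a tool means adding one function and one dictionary entry, and each handler can be tested on its own.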
Descriptions Are Documentation
The AI reads your descriptions to decide when to call tools. Be specific:
# Bad
Tool(name="search", description="Search for things")

# Good
Tool(
    name="search_code",
    description="Search for code patterns in the repository. "
                "Supports regex patterns. Returns matching file paths "
                "and line numbers. Use for finding function definitions, "
                "imports, or specific patterns."
)

Include: what it does, what inputs mean, what output looks like, when to use it vs alternatives.
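
Since "what inputs mean" is the part that's easiest to forget, a small check can flag parameters that lack a description. The sketch below works on a plain dict shaped like a tool definition (`name`, `description`, `inputSchema`). The `missing_descriptions` helper and the `pattern` and `max_results` parameters are hypothetical, made up for this example.

```python
def missing_descriptions(tool: dict) -> list[str]:
    """Return which parts of a tool definition lack a description."""
    missing = []
    if not tool.get("description", "").strip():
        missing.append("tool")
    properties = tool.get("inputSchema", {}).get("properties", {})
    for param, schema in properties.items():
        if not schema.get("description", "").strip():
            missing.append(param)
    return missing


# Hypothetical definition: the parameters are assumptions for illustration.
search_code = {
    "name": "search_code",
    "description": "Search for code patterns in the repository. Supports regex patterns.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "pattern": {"type": "string", "description": "Regex to search for"},
            "max_results": {"type": "integer", "default": 50},  # no description yet
        },
        "required": ["pattern"],
    },
}

if __name__ == "__main__":
    print(missing_descriptions(search_code))
```

A check like this could run in a unit test so an undocumented parameter fails the build instead of confusing the AI at runtime.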
Sensible Defaults
Don't require parameters that have obvious defaults:
{
    "properties": {
        "path": {"type": "string", "description": "File path to read"},
        "encoding": {"type": "string", "default": "utf-8"},
        "max_lines": {"type": "integer", "default": 1000}
    },
    "required": ["path"]  # Only truly required params
}

Error Handling Patterns
Return Errors, Don't Throw
MCP tools should return errors as content, not raise exceptions (except for truly exceptional cases):
async def call_tool(self, name: str, arguments: dict):
    if name == "read_file":
        path = Path(arguments["path"])
        if not path.exists():
            return [TextContent(
                type="text",
                text=f"Error: File not found: {path}\n"
                     "Suggestions: Check the path exists, verify spelling, "
                     "or use 'list_directory' to see available files."
            )]
        if not path.is_file():
            return [TextContent(
                type="text",
                text=f"Error: Path is a directory, not a file: {path}\n"
                     "Use 'list_directory' to see contents."
            )]
        return [TextContent(type="text", text=path.read_text())]

The AI needs actionable information to recover. "File not found" is useless. "File not found at /x/y/z. Did you mean /x/y/a?" is helpful.
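
A "did you mean" hint like that can be produced with the standard library. The following is a sketch, not how any particular server does it: `not_found_message` is a hypothetical helper, and the `n=3` and `cutoff=0.6` values are arbitrary choices you'd want to tune.

```python
import difflib
from pathlib import Path


def not_found_message(path: Path) -> str:
    """Build a 'file not found' error that suggests similarly named siblings."""
    message = f"Error: File not found: {path}"
    parent = path.parent
    if parent.is_dir():
        # Compare the missing name against what actually exists next to it.
        siblings = [entry.name for entry in parent.iterdir()]
        close = difflib.get_close_matches(path.name, siblings, n=3, cutoff=0.6)
        if close:
            suggestions = ", ".join(str(parent / name) for name in close)
            message += f"\nDid you mean: {suggestions}?"
    return message
```

If the parent directory doesn't exist or nothing looks similar, the helper falls back to the plain error line rather than guessing.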
Validate Early
Check inputs before doing work:
async def call_tool(self, name: str, arguments: dict):
    if name == "write_file":
        path = arguments.get("path")
        content = arguments.get("content")

        # Validate before doing anything.
        # error() and success() are assumed helpers that wrap a message in TextContent.
        if not path:
            return error("Missing required parameter: path")
        if content is None:  # an empty string is valid content
            return error("Missing required parameter: content")
        if ".." in Path(path).parts:  # reject parent-directory components
            return error("Path traversal not allowed")

        # Now safe to proceed
        Path(path).write_text(content)
        return success(f"Wrote {len(content)} characters to {path}")

Testing MCP Servers
Unit Test Tool Logic
Extract tool logic into pure functions that are easy to test:
# server.py
import csv
import io
import json


def parse_csv_content(content: str, delimiter: str = ",") -> list[dict]:
    """Pure function: easy to test."""
    reader = csv.DictReader(io.StringIO(content), delimiter=delimiter)
    return list(reader)


@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "parse_csv":
        rows = parse_csv_content(arguments["content"], arguments.get("delimiter", ","))
        return [TextContent(type="text", text=json.dumps(rows))]


# test_server.py
from server import parse_csv_content


def test_parse_csv_content():
    result = parse_csv_content("name,age\nAlice,30\nBob,25")
    assert result == [{"name": "Alice", "age": "30"}, {"name": "Bob", "age": "25"}]


def test_parse_csv_custom_delimiter():
    result = parse_csv_content("name;age\nAlice;30", delimiter=";")
    assert result[0]["name"] == "Alice"

Integration Test the Protocol
Test the full request/response cycle:
# Assumes pytest-asyncio is installed and configured with asyncio_mode = "auto",
# so async fixtures and tests run without extra decorators.
import pytest
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


@pytest.fixture
async def client():
    # Launch server.py as a subprocess and talk to it over stdio
    server_params = StdioServerParameters(
        command="python",
        args=["server.py"]
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            yield session


async def test_read_file_tool(client, tmp_path):
    test_file = tmp_path / "test.txt"
    test_file.write_text("hello world")
    result = await client.call_tool("read_file", {"path": str(test_file)})
    assert result.content[0].text == "hello world"


async def test_read_file_not_found(client):
    result = await client.call_tool("read_file", {"path": "/nonexistent"})
    assert "Error" in result.content[0].text
    assert "not found" in result.content[0].text.lower()

Test Error Paths
Error handling often has more code paths than the happy path. Test them all:
@pytest.mark.parametrize("path,expected_error", [
    ("/nonexistent", "not found"),
    ("/etc", "is a directory"),
    ("../../../etc/passwd", "traversal"),
])
async def test_read_file_errors(client, path, expected_error):
    result = await client.call_tool("read_file", {"path": path})
    assert expected_error in result.content[0].text.lower()

Key Takeaways
- Prefer stateless — State adds complexity; avoid it unless necessary
- One tool, one job — Composable beats comprehensive
- Descriptions matter — They're how AIs decide to use your tools
- Return helpful errors — The AI needs to know how to recover
- Test the protocol — Not just the logic, but the full MCP flow
- Keep it simple — The best MCP servers are boring. They just work.
Building MCP servers is about building trust. When an AI uses your server, it should get predictable, reliable results. Every edge case you handle is one less failure mode in production.
For real-world examples, check out the official MCP servers repository — the filesystem and Git servers are particularly good references.