Building custom Model Context Protocol servers for semantic code search, automated workflows, and integrating AI into daily development.
The Model Context Protocol (MCP) is an open standard by Anthropic that lets AI assistants (like Claude) connect to external tools and data sources. Think of it as a plugin system for LLMs — each MCP server exposes tools that the AI can call.
I built 10 of them at Prachyam, and they're the highest-leverage code I've written — not because they're complex, but because they compound across every session. Here's the problem they solved and what I learned building them.
As a one-person engineering team at Prachyam, I was responsible for everything: frontend, backend, infrastructure, email, CI/CD. Context-switching between codebases was constant. I needed AI assistance that understood our specific systems — not generic code completion, but tools that could query our Docker containers, search our codebase semantically, and automate routine operations.
The specific pain: by sprint 40 of the Sangam rewrite, I was spending the first 15–20 minutes of each session reconstructing which packages had changed, what the current container health looked like, and what the last build had broken. That time was not engineering — it was context reconstruction. The MCP servers exist to eliminate it.
mcp-servers/
├── code-search/ # Semantic search with Qdrant
├── docker-ops/ # Container management
├── git-analytics/ # Repo stats and history
├── mailcow-admin/ # Email domain management
├── nextcloud-files/ # File operations
├── deployment/ # CI/CD trigger and status
├── monitoring/ # Health checks and alerts
├── database/ # PostgreSQL query runner
├── redis-ops/ # Cache inspection/management
└── dns-manager/ # DNS record management
Each server is a TypeScript project using the MCP SDK:
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";
const server = new Server({
  name: "docker-ops",
  version: "1.0.0",
}, {
  capabilities: { tools: {} },
});
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "list_containers",
      description: "List all Docker containers with status",
      inputSchema: {
        type: "object",
        properties: {
          all: { type: "boolean", description: "Include stopped containers" },
        },
      },
    },
    {
      name: "container_logs",
      description: "Get recent logs for a container",
      inputSchema: {
        type: "object",
        properties: {
          name: { type: "string", description: "Container name" },
          lines: { type: "number", description: "Number of lines" },
        },
        required: ["name"],
      },
    },
  ],
}));
The most impactful server was the code search tool. Instead of grep, the AI could search semantically: a query like "find where we handle authentication errors" would return relevant code even if the word "authentication" never appeared.
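The reason a query can match code that shares no keywords with it is that both are compared as embedding vectors rather than as strings. Here is a minimal sketch of that comparison, assuming cosine similarity as the metric (the post does not say which distance the Qdrant collection uses). The names `cosineSimilarity` and `rankBySimilarity` are illustrative only. In practice Qdrant does this ranking server-side with an approximate index, not by brute force.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Result lies in [-1, 1]; higher means the two texts are closer in meaning.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Scored<T> {
  item: T;
  score: number;
}

// Brute-force ranking of candidates against a query vector, best match first.
function rankBySimilarity<T>(
  query: number[],
  candidates: { item: T; vector: number[] }[],
  limit = 5,
): Scored<T>[] {
  return candidates
    .map((c) => ({ item: c.item, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Two chunks about "login failures" and "auth errors" end up with nearby vectors if the embedding model has learned they are related, which is what lets the search surface one when asked about the other.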
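// Embedding pipeline
// Imports assumed here: glob (v9+) and uuid from npm, plus Node built-ins.
// splitIntoChunks, getEmbedding, and the qdrant client are defined elsewhere.
import { glob } from "glob";
import { readFile } from "node:fs/promises";
import { join } from "node:path";
import { v4 as uuid } from "uuid";

async function indexCodebase(repoPath: string) {
  const files = await glob("**/*.{ts,tsx,rs,py}", { cwd: repoPath });
  for (const file of files) {
    const content = await readFile(join(repoPath, file), "utf-8");
    const chunks = splitIntoChunks(content, 512); // ~512 token chunks
    for (const chunk of chunks) {
      const embedding = await getEmbedding(chunk.text);
      // One point per chunk; the payload keeps enough context to show the hit
      await qdrant.upsert("code", {
        points: [{
          id: uuid(),
          vector: embedding,
          payload: {
            file,
            text: chunk.text,
            startLine: chunk.startLine,
            endLine: chunk.endLine,
          },
        }],
      });
    }
  }
}
The search tool then queries Qdrant with the user's natural language question: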
async function searchCode(query: string, limit = 5) {
  // Embed the question with the same model used at index time
  const queryVector = await getEmbedding(query);
  const results = await qdrant.search("code", {
    vector: queryVector,
    limit,
    with_payload: true,
  });
  // payload is optional in the client's types, hence the optional chaining
  return results.map(r => ({
    file: r.payload?.file,
    lines: `${r.payload?.startLine}-${r.payload?.endLine}`,
    text: r.payload?.text,
    score: r.score,
  }));
}
Here's a typical workflow. I'm debugging a deployment issue and ask Claude:
"Check if the mailcow container is running, then show me the last 50 lines of its logs"
Claude calls two MCP tools sequentially:
1. docker-ops.list_containers confirms mailcow is running
2. docker-ops.container_logs({ name: "mailcow", lines: 50 }) fetches the logs
No tab-switching, no copy-pasting. The AI has direct access to the same tools I'd use manually.
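On the server side, each of those calls arrives as a tool name plus a JSON arguments object, and the server has to route it to the right handler. Below is a minimal sketch of that dispatch step. It is not the author's implementation: the handler bodies, the default of 100 log lines, and the names `tools` and `dispatchToolCall` are all hypothetical, and the real servers would register this logic through the MCP SDK. The result shape (a `content` list plus an optional `isError` flag) follows the MCP tool result format.

```typescript
// MCP tool results carry a content list and an optional error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

type ToolArgs = Record<string, unknown>;

interface ToolDef {
  required: string[];
  run: (args: ToolArgs) => Promise<string>;
}

// Hypothetical handlers; real ones would query the Docker API or CLI.
const tools = new Map<string, ToolDef>([
  ["list_containers", {
    required: [],
    run: async (args) => `listing containers (all=${args.all === true})`,
  }],
  ["container_logs", {
    required: ["name"],
    run: async (args) => {
      // Default line count is an assumption, not taken from the post.
      const lines = typeof args.lines === "number" ? args.lines : 100;
      return `last ${lines} log lines for ${String(args.name)}`;
    },
  }],
]);

// Route a call by tool name, rejecting unknown tools and missing required arguments.
async function dispatchToolCall(name: string, args: ToolArgs = {}): Promise<ToolResult> {
  const tool = tools.get(name);
  if (!tool) {
    return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
  }
  const missing = tool.required.filter((key) => args[key] === undefined);
  if (missing.length > 0) {
    return {
      content: [{ type: "text", text: `Missing required argument(s): ${missing.join(", ")}` }],
      isError: true,
    };
  }
  return { content: [{ type: "text", text: await tool.run(args) }] };
}
```

Returning failures as results with `isError` set, rather than throwing, gives the model something it can read and react to, for example by retrying with the missing argument.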
Composability. Because each server is a standalone process, I can mix and match. Claude can use the code search server alongside the Docker ops server to correlate code changes with container behavior.
Type safety. TypeScript + the MCP SDK's schema validation means tools are well-typed and self-documenting. The AI gets schema information to understand what parameters each tool accepts.
Incremental adoption. I started with 3 servers (code search, Docker, git). Each new one was built when I felt the pain of doing something manually too often.
Better error messages. Early servers returned raw error objects. Now every tool returns human-readable errors that help the AI diagnose issues.
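One way to get from raw error objects to readable ones is to wrap every handler in a single function that catches failures and rewrites them. This is only a sketch under assumptions: the hint table, `describeError`, and `withReadableErrors` are hypothetical names, and the error codes shown are standard Node.js system error codes rather than anything specific to these servers.

```typescript
// MCP tool results carry a content list and an optional error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical hints keyed by Node.js system error codes.
const HINTS = new Map<string, string>([
  ["ECONNREFUSED", "The target service refused the connection; check that it is running."],
  ["ENOENT", "A file or executable was not found; check the path."],
  ["ETIMEDOUT", "The operation timed out; the service may be overloaded or unreachable."],
]);

// Turn any thrown value into one readable sentence, plus a hint when the code is known.
function describeError(tool: string, err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  const code =
    typeof err === "object" && err !== null && "code" in err
      ? String((err as { code: unknown }).code)
      : "";
  const hint = HINTS.get(code);
  return hint ? `${tool} failed: ${message}. ${hint}` : `${tool} failed: ${message}`;
}

// Run a handler and convert any failure into an error result instead of throwing.
async function withReadableErrors(
  tool: string,
  run: () => Promise<string>,
): Promise<ToolResult> {
  try {
    return { content: [{ type: "text", text: await run() }] };
  } catch (err) {
    return { content: [{ type: "text", text: describeError(tool, err) }], isError: true };
  }
}
```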
Rate limiting from the start. A few times, Claude got into a loop calling a database query tool repeatedly. Adding request limits per tool per minute solved this.
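A per-tool, per-minute limit can be as small as a sliding window of timestamps. The sketch below is one possible shape, not the limiter these servers actually use; the class name `ToolRateLimiter` and the in-memory storage are assumptions, and the `now` parameter exists only to make the behaviour easy to test.

```typescript
// Sliding-window limiter: at most `maxCalls` per tool within `windowMs`.
class ToolRateLimiter {
  private readonly calls = new Map<string, number[]>();
  private readonly maxCalls: number;
  private readonly windowMs: number;

  constructor(maxCalls: number, windowMs = 60000) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Records the call and returns true if the tool is under its limit; otherwise returns false.
  tryAcquire(tool: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that are still inside the window.
    const recent = (this.calls.get(tool) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxCalls) {
      this.calls.set(tool, recent);
      return false;
    }
    recent.push(now);
    this.calls.set(tool, recent);
    return true;
  }
}
```

A handler would call `tryAcquire` before doing any work and, when it returns false, respond with a readable "rate limit reached, try again shortly" message so the model stops retrying instead of looping.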
Qdrant needs maintenance. Vector indices grow. I now run a weekly re-index that prunes deleted files and updates changed ones, rather than only appending.
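The prune-and-update pass comes down to comparing what was indexed last time with what is on disk now. Here is a sketch of that comparison, assuming a content hash is recorded per file at index time (the post does not say how changes are detected). `planReindex` and `ReindexPlan` are hypothetical names.

```typescript
interface ReindexPlan {
  toUpsert: string[];  // new or changed files: re-embed and replace their points
  toDelete: string[];  // files no longer on disk: remove their points
  unchanged: string[]; // files whose hash matches: skip
}

// Compare hashes recorded at the last index run with hashes of the files on disk now.
function planReindex(
  indexed: Map<string, string>, // file path -> content hash stored at last index
  current: Map<string, string>, // file path -> content hash on disk now
): ReindexPlan {
  const plan: ReindexPlan = { toUpsert: [], toDelete: [], unchanged: [] };
  current.forEach((hash, file) => {
    if (indexed.get(file) === hash) {
      plan.unchanged.push(file);
    } else {
      plan.toUpsert.push(file);
    }
  });
  indexed.forEach((_hash, file) => {
    if (!current.has(file)) plan.toDelete.push(file);
  });
  return plan;
}
```

Because the indexing code assigns a fresh random ID to every chunk, a changed file's old points are not overwritten by a new upsert. They have to be deleted first, and Qdrant's delete-by-payload-filter on the `file` field is one way to do that.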
Building AI tooling is itself a signal. It shows you understand how to integrate AI into real workflows — not just chatbots, but tools that make engineering teams faster. These 10 servers are some of the most impactful code I've written, not because they're complex, but because they multiply my productivity every session. The context reconstruction problem I described above — 15–20 minutes per session — disappeared. That's the kind of engineering that doesn't show in commit counts or LOC but shows in what you actually ship.
Karanveer Singh Shaktawat
Full Stack Engineer & Infrastructure Architect
I build production systems across web, mobile, and infrastructure — then document what went wrong and why.