The stack
Six RackNerd VPS instances (~₹14,000/year, ~$151, in total) running Mailcow (~20M emails, 12 domains, rotating IPs), plus Nextcloud (22-user collaboration, 5TB storage), a 3-node Tailscale mesh for development, and 10 custom MCP servers with Qdrant vector search indexing the entire codebase. My estimate of what the email side alone replaced, based on a Mailchimp enterprise quote at this volume, is around ₹2.78cr (~$300k). I hold that figure at medium confidence, since it is a projected quote rather than a realized bill. See Lessons from Infrastructure for the philosophy behind the decision.
What it replaced
Google Workspace for the 12 domains, LucidLink for file sync, Mailchimp-scale managed email for marketing sends, and cloud AI APIs for development tooling. The replacements were not drop-in equivalents — each required understanding the underlying system well enough to operate it. That understanding is the actual return on the investment, separate from the cost savings.
Mail server architecture
Six RackNerd VPS nodes, twelve custom sending domains, Mailcow as the mail stack on each. The critical architectural decision: separate the transactional mail path from the bulk marketing path. Same-server commingling caused the queue starvation incident documented in the Mailcow Deep Dive series — marketing bulk sends filling the Postfix active queue while password reset emails waited behind them.
Postfix transport maps route transactional mail through an internal transport with no artificial rate delay and high concurrency. Marketing sends route through a bulk transport with per-destination throttling. Two independent queues that cannot starve each other.
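The effect of the split can be illustrated with a toy model. This is a sketch of the idea, not the actual Postfix configuration: the class names, the "transactional"/"marketing" labels, and the per-tick limits are all assumptions made for illustration. In Postfix itself the corresponding knobs would be per-transport settings such as the `<transport>_destination_rate_delay` and `<transport>_destination_concurrency_limit` parameters.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    kind: str        # "transactional" or "marketing" (hypothetical labels)
    recipient: str


@dataclass
class Transport:
    name: str
    per_tick: int    # deliveries allowed per scheduler tick (stand-in for throttling)
    queue: deque = field(default_factory=deque)

    def drain(self):
        """Deliver up to per_tick messages from this transport's own queue."""
        count = min(self.per_tick, len(self.queue))
        return [self.queue.popleft() for _ in range(count)]


def route(msg, transports):
    """Pick a transport by message class and enqueue the message there."""
    name = "internal" if msg.kind == "transactional" else "bulk"
    transports[name].queue.append(msg)
    return name


def tick(transports):
    """Run one scheduler pass; each transport drains independently."""
    return {name: t.drain() for name, t in transports.items()}


if __name__ == "__main__":
    transports = {
        "internal": Transport("internal", per_tick=50),  # no artificial delay
        "bulk": Transport("bulk", per_tick=5),           # throttled
    }
    for i in range(1000):
        route(Message("marketing", f"user{i}@example.com"), transports)
    route(Message("transactional", "reset@example.com"), transports)
    sent = tick(transports)
    print(len(sent["internal"]), len(sent["bulk"]))  # prints "1 5"
```

The password reset goes out on the first pass even though a thousand marketing messages were queued ahead of it, because it never shares a queue with them.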
SPF, DKIM, and DMARC on every domain. Rspamd for spam scoring. IP warmup on every new node — six weeks of ramping volume, sending to engaged users first. Zero IP blacklistings across the campaign run.
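A six-week warmup can be sketched as a schedule of daily sending caps plus a rule for who receives mail first. The geometric ramp and the starting and target volumes below are assumptions for illustration, not the figures actually used; the only details taken from the text are the six-week duration and the engaged-users-first ordering.

```python
def warmup_caps(start_per_day, target_per_day, weeks=6):
    """Return one daily sending cap per week, growing geometrically
    from start_per_day in week 1 to target_per_day in the final week."""
    if weeks < 2:
        raise ValueError("warmup needs at least two weeks to ramp")
    ratio = target_per_day / start_per_day
    return [round(start_per_day * ratio ** (w / (weeks - 1))) for w in range(weeks)]


def select_recipients(recipients, cap):
    """Pick up to `cap` addresses, most recently engaged first.

    `recipients` is a list of (address, days_since_last_engagement) pairs.
    """
    ranked = sorted(recipients, key=lambda r: r[1])
    return [address for address, _ in ranked[:cap]]
```

For example, `warmup_caps(500, 50000)` yields six caps that start at 500 and end at 50,000 per day, and `select_recipients` fills each day's cap from the most recently engaged addresses downward.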
The development mesh
Three machines on a Tailscale mesh: the M1 iMac as the build and coding machine, and two other nodes running the Docker service stack. dnsmasq on the iMac resolves *.dev.prachyam.local to Tailscale node IPs; Caddy on each service node handles TLS termination and reverse proxying. Every service gets a stable HTTPS subdomain reachable from the build machine without port numbers or browser security warnings. (The public coming-soon site lives separately at prachyam.willmakeitsoon.com.)
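One way to keep the name-to-node mapping in a single place is to generate the dnsmasq entries from a small table. The sketch below assumes per-service `address=/name/ip` lines; the service names and IPs are hypothetical, and the range check relies on Tailscale assigning node addresses from 100.64.0.0/10.

```python
import ipaddress

# Tailscale node addresses come from the CGNAT range.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def dnsmasq_lines(services, base_domain="dev.prachyam.local"):
    """Build dnsmasq `address=` lines mapping <service>.<base_domain>
    to the Tailscale IP of the node that hosts the service."""
    lines = []
    for name, ip in sorted(services.items()):
        addr = ipaddress.ip_address(ip)
        if addr not in TAILSCALE_RANGE:
            raise ValueError(f"{ip} is not a Tailscale address")
        lines.append(f"address=/{name}.{base_domain}/{addr}")
    return lines
```

Calling `dnsmasq_lines({"grafana": "100.101.102.103"})` returns `["address=/grafana.dev.prachyam.local/100.101.102.103"]`. dnsmasq applies an `address=` entry to the named domain and everything beneath it, so one line per service is enough.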
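The wildcard TLS certificate is issued through the DNS-01 ACME challenge via the Cloudflare API. DNS-01 is the only ACME challenge type that can issue wildcard certificates, and the only one that works for hosts the certificate authority cannot reach over the public internet, which is the case for services on the mesh. One wildcard certificate covers every current and future subdomain. No per-service certificate management.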
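For reference, the DNS-01 proof is a single TXT record whose value is derived from the challenge token and the ACME account key, as defined in RFC 8555. An ACME client computes and publishes it automatically; the sketch below only shows the derivation. The domain, token, and thumbprint in the usage note are placeholders, and the zone is assumed to be one the certificate authority can query over public DNS.

```python
import base64
import hashlib


def b64url(data: bytes) -> str:
    """Base64url encoding without padding, as ACME requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_record(identifier: str, token: str, account_thumbprint: str):
    """Return (record_name, txt_value) for a DNS-01 challenge.

    A wildcard identifier such as "*.dev.example.org" is validated at the
    base name, so the leading "*." is dropped from the record name.
    """
    domain = identifier[2:] if identifier.startswith("*.") else identifier
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return f"_acme-challenge.{domain}", b64url(digest)
```

`dns01_record("*.dev.example.org", token, thumbprint)` returns the name `_acme-challenge.dev.example.org` and a 43-character value. The DNS provider's API only has to create that one TXT record, which is why nothing on the mesh needs to be reachable from outside.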
The practical result: OAuth flows work in development. Secure cookies work. Service workers work. Staging exists to validate business logic, not to reproduce the HTTPS environment.
AI tooling
Ten custom MCP servers handle the contexts I'd otherwise reconstruct manually at the start of each session: ott-context-mcp for architectural decisions and fix patterns per package, drizzle-studio-mcp for live DB inspection, infra-monitor-mcp for Docker service health, monitoring-mcp for Prometheus queries, api-test-mcp for endpoint testing, video-monitor-mcp for transcode job status, code-search-mcp for Qdrant semantic search, docker-ops-mcp for container management, git-analytics-mcp for repo stats, and mailcow-admin-mcp for domain management.
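Each of these servers ultimately answers JSON-RPC 2.0 requests, with tool invocations arriving as `tools/call`. The sketch below shows the shape of that exchange using only the standard library. It is not how the servers listed above are implemented, and the `service_health` tool and its argument are hypothetical.

```python
import json


def handle_request(raw: str, tools: dict) -> dict:
    """Dispatch one MCP-style JSON-RPC request to a registered tool."""
    req = json.loads(raw)
    req_id = req.get("id")
    if req.get("method") != "tools/call":
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32601, "message": "Method not found"}}
    params = req.get("params", {})
    tool = tools.get(params.get("name"))
    if tool is None:
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32602, "message": "Unknown tool"}}
    text = tool(**params.get("arguments", {}))
    # Tool results are returned as a list of content items.
    return {"jsonrpc": "2.0", "id": req_id,
            "result": {"content": [{"type": "text", "text": text}],
                       "isError": False}}


def service_health(service: str) -> str:
    """Hypothetical tool: a real one would query Docker for the status."""
    return f"{service}: running"
```

A request such as `{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "service_health", "arguments": {"service": "qdrant"}}}` passed to `handle_request` with `{"service_health": service_health}` comes back as a single text content item.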
Qdrant indexes the entire Sangam monorepo — semantic search across ~971K lines of TypeScript, Rust, Dart, and shell. "Find where we handle authentication errors" returns relevant code even when the word "authentication" doesn't appear. See Lessons from Infrastructure for why this is worth the setup cost.
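The reason a query can match code that never contains its keywords is that both are compared as vectors rather than as strings. Here is a minimal sketch using cosine similarity, one of the distance metrics Qdrant supports. The three-dimensional vectors and file paths are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, index, top_k=2):
    """Return the top_k paths whose vectors are closest to the query."""
    scored = [(cosine(query_vec, vec), path) for path, vec in index.items()]
    scored.sort(reverse=True)
    return [path for _, path in scored[:top_k]]


# Hypothetical index: path -> embedding of that code chunk.
INDEX = {
    "auth/login_errors.ts": [0.9, 0.1, 0.0],
    "session/token_refresh.ts": [0.8, 0.3, 0.1],
    "video/transcode.rs": [0.0, 0.2, 0.9],
}
```

With a query vector of `[1.0, 0.2, 0.0]` standing in for "find where we handle authentication errors", `search` returns `auth/login_errors.ts` and then `session/token_refresh.ts`. The second result matches on meaning alone, since nothing in its path mentions authentication.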
The lesson
Self-hosting is a commitment. You're the sysadmin. Automated backups, monitoring alerts, and written runbooks are not optional — they're what makes the system survivable when something breaks at 2am, which it will. The knowledge that accumulates from running these systems is the dividend that compounds after the cost savings have already paid back.