Homelab

Build your own infrastructure — practical guides for virtualization, containers, networking, storage, monitoring, and automation. From a real homelab, not a textbook.

Table of Contents
  1. Proxmox VE Setup
  2. Docker Fundamentals
  3. Tailscale Mesh VPN
  4. Self-Hosting Essentials
  5. NAS & Storage
  6. Monitoring Stack
  7. Backup Strategy
  8. Network Segmentation
  9. Reverse Proxy
  10. GPU Passthrough
  11. Home Automation
1. Proxmox VE Setup

Install Proxmox VE, configure your first virtual machines and containers, set up storage pools, and understand clustering basics. The foundation of every serious homelab.

Why Proxmox?

Proxmox Virtual Environment (VE) is a free, open-source hypervisor based on Debian Linux. It combines KVM for full virtual machines and LXC for lightweight containers in a single web interface. Unlike ESXi (which went paid-only in 2024), Proxmox is completely free for homelab use.

It’s what most homelabbers run, and for good reason: it’s stable, well-documented, and has an active community.

Installation

  1. Download the ISO from proxmox.com/downloads
  2. Flash to USB with Rufus, Balena Etcher, or dd
  3. Boot from USB, follow the installer (choose ZFS mirror if you have two drives)
  4. Access the web UI at https://your-ip:8006
First thing after install: Disable the enterprise repo and enable the no-subscription repo. Edit /etc/apt/sources.list.d/pve-enterprise.list and comment out the enterprise line, then add the free repo.
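Concretely, the repository change looks like this (a sketch assuming Proxmox VE 8 on Debian bookworm; substitute your release codename):

```shell
# /etc/apt/sources.list.d/pve-enterprise.list — comment out the subscription repo:
# deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise

# /etc/apt/sources.list.d/pve-no-subscription.list — add the free repo:
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription
```

Run apt update && apt full-upgrade afterwards so packages come from the new repo.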

VMs vs Containers

  • VMs (KVM) — full virtualization. Run any OS (Windows, Linux, BSD). Higher overhead but complete isolation. Use for: Windows, anything needing a full kernel, untrusted workloads
  • LXC Containers — lightweight, share the host kernel. Lower overhead, faster startup. Use for: Linux services (Pi-hole, Nginx, databases), anything that doesn’t need a separate kernel

Rule of thumb: use containers for everything you can, VMs for everything else.

Storage Configuration

  • local — default storage on the boot drive. Good for ISOs and templates
  • local-lvm — LVM thin pool for VM disks. Default for new VMs
  • ZFS — if you chose ZFS during install, you get snapshots, compression, and data integrity for free
  • NFS/CIFS — add network shares for backups or shared storage

Clustering

If you have 3+ Proxmox nodes, you can create a cluster for:

  • Centralized management (one web UI for all nodes)
  • Live migration (move running VMs between nodes with zero downtime)
  • High Availability (auto-restart VMs on another node if one fails)
Important: Avoid a 2-node cluster. It loses quorum the moment either node goes down, and forcing quorum back by hand risks split-brain and data corruption. Either run 3+ nodes or add a QDevice (corosync-qdevice on any spare machine, even a Raspberry Pi) as a tie-breaking third vote.
2. Docker Fundamentals

Containers, Compose files, networking, volumes, and best practices. Docker is the engine that runs most self-hosted services — understanding it properly saves countless hours of debugging.

Containers vs VMs

A container is not a lightweight VM. It’s an isolated process (or group of processes) that shares the host’s kernel but has its own filesystem, network, and process space. Think of it as chroot on steroids with cgroups for resource limits and namespaces for isolation.

Docker Compose

Running docker run commands with 20 flags is unmaintainable. Docker Compose lets you define multi-container applications in a YAML file:

  • Define services, networks, and volumes in one file
  • docker compose up -d starts everything
  • docker compose down stops and removes containers
  • docker compose pull updates images
Best practice: One docker-compose.yml per service (not one giant file for everything). This lets you update, restart, and manage services independently.
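A minimal Compose file might look like this (Uptime Kuma used as an example service; names and paths are illustrative):

```yaml
# docker-compose.yml — one per service
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    restart: unless-stopped
    ports:
      - "3001:3001"            # host:container
    volumes:
      - ./data:/app/data       # bind mount so data survives rebuilds
```

docker compose up -d in this directory brings the service up; docker compose pull followed by docker compose up -d updates it in place.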

Networking

  • Bridge (default) — containers get internal IPs, publish ports to host with -p
  • Host — container uses host’s network stack directly. No port mapping needed. Use sparingly
  • Macvlan — container gets its own IP on your LAN. Useful for services that need to be on the same subnet as physical devices

Create custom bridge networks for inter-container communication. Containers on the same custom network can resolve each other by container name.
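As a sketch (the network, container names, and the my-app image are examples):

```shell
docker network create backend                      # user-defined bridge

docker run -d --name db  --network backend postgres:16
docker run -d --name app --network backend my-app  # hypothetical image
# "app" can now reach the database at db:5432, because Docker's embedded
# DNS resolves container names on user-defined networks
```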

Volumes and Data Persistence

  • Named volumes — Docker manages the storage. Use for databases and application data
  • Bind mounts — map a host directory into the container. Use for config files and data you want to manage directly
Never store important data only inside a container. Containers are ephemeral: rebuilding one discards anything written to its writable layer, and docker compose down -v deletes the volumes declared in the Compose file as well. Always use named volumes or bind mounts, and back them up, for anything you care about.

Essential Commands

  • docker compose logs -f service_name — tail logs
  • docker compose exec service_name sh — shell into running container
  • docker system prune -a — reclaim disk space (removes unused images, containers, networks)
  • docker stats — live resource usage per container
3. Tailscale Mesh VPN

Zero-config WireGuard VPN that connects all your devices into a private mesh network. Exit nodes, subnet routing, ACLs, and MagicDNS — remote access to your homelab without port forwarding.

Why Tailscale?

Tailscale wraps WireGuard in a zero-config coordination layer. Every device gets a stable 100.x.x.x IP address. Devices find each other through Tailscale’s coordination server, but traffic flows directly peer-to-peer (encrypted with WireGuard). No port forwarding, no dynamic DNS, no firewall holes.

Setup

  1. Install on each device: curl -fsSL https://tailscale.com/install.sh | sh
  2. Authenticate: sudo tailscale up
  3. Done. Every device can now reach every other device via 100.x.x.x

Key Features for Homelabs

  • Subnet routing — advertise your LAN subnet so Tailscale devices can reach non-Tailscale devices on your network: tailscale up --advertise-routes=10.10.10.0/24
  • Exit nodes — route all internet traffic through a specific machine. Great for using your home IP while traveling: tailscale up --advertise-exit-node
  • MagicDNS — access devices by hostname instead of IP: ssh user@proxmox instead of remembering the machine’s 100.x.x.x address
  • ACLs — control which devices can talk to which. Restrict your IoT VLAN from reaching your NAS, for example
  • Funnel — expose a service to the public internet through Tailscale’s edge (no port forwarding needed)
Free tier is generous: 100 devices, 3 users, all core features. For a homelab, you’ll likely never need to pay.
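ACLs live in a policy file edited in the admin console. A hedged sketch (the tags are assumptions you would define yourself under tagOwners; Tailscale policy files are HuJSON, so comments are allowed):

```json
{
  "acls": [
    // trusted devices may reach the whole tailnet
    { "action": "accept", "src": ["tag:trusted"], "dst": ["*:*"] },

    // IoT devices get internet access only, nothing on the tailnet
    { "action": "accept", "src": ["tag:iot"], "dst": ["autogroup:internet:*"] }
  ]
}
```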

Alternative: Headscale

If you don’t want to depend on Tailscale’s coordination server, Headscale is an open-source, self-hosted implementation of the Tailscale coordination server. Same WireGuard mesh, but you control everything. Trade-off: more setup, more maintenance, but full sovereignty over your network.

4. Self-Hosting Essentials

What to self-host, domain setup, DNS configuration, Cloudflare as a reverse proxy, and the practical trade-offs of running your own services vs using SaaS.

What’s Worth Self-Hosting?

Not everything should be self-hosted. The best candidates are services where you gain privacy, control, or cost savings without taking on unacceptable risk:

  • High value: Nextcloud (files), Vaultwarden (passwords), Immich (photos), Jellyfin (media), Gitea (code), Paperless-ngx (documents)
  • Medium value: Pi-hole/AdGuard (DNS), Uptime Kuma (monitoring), Bookstack (wiki), Mealie (recipes)
  • Think twice: Email (deliverability is painful), public-facing websites (DDoS risk), anything your family relies on without a backup plan

Domain and DNS Setup

  1. Buy a domain (Cloudflare Registrar, Porkbun, or Namecheap)
  2. Point nameservers to Cloudflare (even if you bought elsewhere)
  3. Create A records for your services: cloud.yourdomain.com, git.yourdomain.com, etc.
  4. Enable Cloudflare proxy (orange cloud) for DDoS protection and SSL

Cloudflare Tunnels

Cloudflare Tunnels (formerly Argo Tunnels) let you expose internal services to the internet without opening any ports on your router. A lightweight daemon (cloudflared) runs on your server and creates an outbound-only connection to Cloudflare’s edge.

This is the recommended approach for exposing services to the internet. No port forwarding, no dynamic DNS, and Cloudflare handles SSL, DDoS protection, and bot filtering for free.
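A cloudflared ingress config is one short YAML file (the tunnel name, hostnames, and ports here are examples):

```yaml
# ~/.cloudflared/config.yml
tunnel: homelab
credentials-file: /root/.cloudflared/homelab.json

ingress:
  - hostname: cloud.yourdomain.com
    service: http://localhost:8080    # e.g. Nextcloud
  - hostname: git.yourdomain.com
    service: http://localhost:3000    # e.g. Gitea
  - service: http_status:404          # required catch-all rule
```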

The Self-Hosting Trade-Off

Self-hosting gives you control and privacy but costs you time and reliability. You’re the sysadmin now — if it breaks at 2 AM, it’s your problem. Always have a plan for when (not if) hardware fails:

  • Automated backups (see Guide 7)
  • Monitoring and alerts (see Guide 6)
  • A fallback for critical services (can you survive without your NAS for a week?)
5. NAS & Storage

TrueNAS, ZFS fundamentals, RAID levels, snapshot-based backup design, and why ECC RAM actually matters. Your data is only as safe as your storage strategy.

ZFS: The Homelab Filesystem

ZFS is a combined filesystem and volume manager that provides data integrity verification, snapshots, compression, and redundancy. It’s the default choice for NAS systems because it catches and corrects silent data corruption (bit rot) that other filesystems miss.

  • Checksumming — every block is checksummed. ZFS detects corruption on read and auto-heals from redundant copies
  • Copy-on-write — data is never overwritten in place. Writes go to new blocks, then the pointer is updated atomically. No partial writes, no fsck needed
  • Snapshots — instant, zero-cost point-in-time copies. Roll back a dataset to any snapshot in seconds
  • Compression — LZ4 compression is nearly free (CPU-wise) and saves 30-50% on typical data
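The day-to-day commands behind these features are short (pool and dataset names are examples):

```shell
zfs set compression=lz4 tank/data      # near-free LZ4 compression
zfs snapshot tank/data@before-upgrade  # instant point-in-time snapshot
zfs list -t snapshot                   # list existing snapshots
zfs rollback tank/data@before-upgrade  # revert the dataset
zpool scrub tank                       # verify every checksum in the pool
```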

RAID Levels (ZFS Pools)

  • Mirror (RAID 1) — 2+ drives, each is an exact copy. 50% usable space. Fast reads. Best for small setups (2-4 drives)
  • RAIDZ1 (RAID 5) — single parity. Can lose 1 drive. Good for 3-5 drives. Minimum recommended for data you care about
  • RAIDZ2 (RAID 6) — double parity. Can lose 2 drives. Recommended for 6+ drives or large capacity drives (rebuild times are long)
  • RAIDZ3 — triple parity. For enterprise or very large arrays
RAID is not backup. RAID protects against drive failure. It does NOT protect against accidental deletion, ransomware, fire, theft, or controller failure. You still need backups.

TrueNAS Setup Tips

  • Use TrueNAS SCALE (Linux-based) over CORE (FreeBSD) for better Docker/container support
  • Give ZFS as much RAM as you can. The ARC (Adaptive Replacement Cache) dramatically improves read performance
  • Enable automated snapshots: hourly for 24h, daily for 30 days, weekly for 6 months
  • Set up automated scrubs (monthly) to detect and repair bit rot
6. Monitoring Stack

Grafana + Prometheus + node_exporter + alerting. See everything happening in your homelab with dashboards, graphs, and alerts that actually tell you when something breaks.

The Stack

  • Prometheus — time-series database that scrapes metrics from exporters at regular intervals (typically 15s)
  • node_exporter — exposes Linux host metrics (CPU, RAM, disk, network) on port 9100
  • Grafana — visualization platform. Connects to Prometheus as a data source and renders dashboards
  • Alertmanager — handles alert routing, deduplication, and notification (email, Slack, Discord, PagerDuty)

Quick Setup with Docker Compose

The entire monitoring stack can run in Docker containers:

  • Prometheus: scrapes metrics, stores time-series data
  • node_exporter: runs on each host you want to monitor
  • Grafana: web UI on port 3000, import pre-built dashboards
  • cAdvisor: monitors Docker container resource usage
Start with dashboard ID 1860 (Node Exporter Full) from Grafana’s dashboard marketplace. It gives you comprehensive host monitoring out of the box. Then customize from there.
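The Prometheus side of the stack is one small config file (hostnames are examples):

```yaml
# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: node                       # node_exporter on each host
    static_configs:
      - targets: ["pve1:9100", "nas:9100"]
  - job_name: cadvisor                   # Docker container metrics
    static_configs:
      - targets: ["docker-host:8080"]
```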

What to Monitor

  • Host metrics: CPU, RAM, disk I/O, network, filesystem usage, temperature
  • Container metrics: per-container CPU, memory, network (via cAdvisor)
  • Service health: Uptime Kuma or Blackbox exporter for HTTP/TCP checks
  • Smart disk health: smartctl_exporter for drive failure prediction
  • UPS status: NUT (Network UPS Tools) exporter for battery monitoring

Alerting That Matters

Only alert on things that require action. Good alerts:

  • Disk usage above 85% (time to clean up or expand)
  • Host unreachable for 2+ minutes
  • RAM usage above 90% sustained for 10+ minutes
  • SMART disk warning (replace the drive)
  • UPS on battery (power outage)
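The first of those translates to a Prometheus alerting rule like this (a sketch; thresholds and labels are yours to tune):

```yaml
# alerts.yml
groups:
  - name: homelab
    rules:
      - alert: DiskAlmostFull
        expr: |
          (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
            / node_filesystem_size_bytes) < 0.15
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }}: filesystem over 85% used"
```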
7. Backup Strategy

The 3-2-1 rule, Borg vs Restic, offsite options, and automation. The backup you never test is the backup that fails when you need it most.

The 3-2-1 Rule

  • 3 copies of your data (original + 2 backups)
  • 2 different storage media (e.g., SSD + HDD, or local + cloud)
  • 1 offsite copy (protects against fire, theft, natural disaster)

This isn’t paranoia — it’s math. Any single backup has a non-trivial failure probability, and independent failures multiply: two copies that each have a 1% chance of being unusable leave only a 0.01% chance of losing both.

Borg Backup

Borg is a deduplicating, encrypting backup program. It’s fast, space-efficient, and battle-tested:

  • Deduplication at the chunk level — only new/changed data is stored
  • Client-side encryption (AES-256) — the backup server never sees your data
  • Compression (LZ4, ZSTD, LZMA) — further reduces storage
  • Mount any backup as a FUSE filesystem to browse and restore individual files

Restic

Restic is similar to Borg but with native support for cloud backends:

  • Supports S3, B2, Azure, Google Cloud, SFTP, and local storage
  • Faster for cloud-based offsite backups
  • Single binary, no dependencies
  • Built-in data integrity verification
My recommendation: Borg for local/LAN backups (faster, more mature). Restic for offsite/cloud backups (native cloud support). Use both for 3-2-1 compliance.
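In practice that pairing looks roughly like this (repository locations, bucket names, and retention policies are examples):

```shell
# Local/LAN backup with Borg (assumes the repo was created with `borg init`)
borg create --compression zstd \
    ssh://backup-host/./borg-repo::'{hostname}-{now}' /srv/data
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
    ssh://backup-host/./borg-repo

# Offsite copy with Restic to Backblaze B2 (credentials via environment variables)
restic -r b2:my-bucket:homelab backup /srv/data
restic -r b2:my-bucket:homelab forget --keep-daily 7 --keep-monthly 6 --prune
```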

Offsite Options

  • Backblaze B2 — $6/TB/month. The cheapest S3-compatible storage. Use with Restic or rclone
  • Hetzner Storage Box — cheap BorgBackup-compatible remote storage (SSH/SFTP)
  • Another physical location — a friend’s house, a small NAS at a relative’s place, encrypted and synced nightly

Automation and Testing

A backup that isn’t automated will eventually stop happening. A backup that isn’t tested will eventually fail to restore.

  • Cron jobs or systemd timers for daily backups
  • Monitor backup completion with your alerting stack
  • Test restores quarterly — actually restore files and verify them
  • Document the restore process. When you need it, you’ll be stressed and possibly sleep-deprived
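A systemd timer for the nightly run might look like this (unit names are examples; pair it with a backup.service that runs your backup script):

```ini
# /etc/systemd/system/backup.timer
[Unit]
Description=Nightly backup

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true      # catch up after boot if the 02:00 run was missed

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now backup.timer, and check systemctl list-timers to confirm the next run.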
8. Network Segmentation

VLANs, firewall rules, IoT isolation, and pfSense/OPNsense configuration. Keep your smart devices from talking to your NAS, and your guest network from seeing your servers.

Why Segment?

A flat network means every device can talk to every other device. Your kid’s IoT toy, your smart TV, and your NAS with irreplaceable family photos are all on the same network. If any device is compromised, everything is reachable.

VLANs create virtual network segments that act like separate physical networks. Firewall rules control what traffic can flow between them.

Recommended VLAN Layout

  • VLAN 10 — Management (10.10.10.0/24): Proxmox hosts, switches, access points, router admin. Restricted access
  • VLAN 20 — Servers (10.10.20.0/24): Docker hosts, VMs running services
  • VLAN 30 — Trusted (10.10.30.0/24): Your personal devices (laptop, phone, desktop)
  • VLAN 40 — IoT (10.10.40.0/24): Smart home devices, cameras, thermostats. Internet access only — no access to other VLANs
  • VLAN 50 — Guest (10.10.50.0/24): Guest WiFi. Internet only, nothing else

Firewall Rules (pfSense/OPNsense)

The basic principle: deny everything by default, then allow specific traffic:

  • Trusted → Servers: Allow (you need to access your services)
  • Trusted → Management: Allow (you need to admin your infrastructure)
  • IoT → Internet: Allow (smart devices need cloud access)
  • IoT → anything else: Deny (contain compromised devices)
  • Guest → Internet: Allow
  • Guest → anything else: Deny
  • Servers → Internet: Allow specific (updates, API calls)
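pfSense and OPNsense express this through their GUIs, but the underlying logic is a default-deny forward policy. A sketch of the same rules in nftables syntax, with interface names as assumptions:

```
table inet fw {
  chain forward {
    type filter hook forward priority 0; policy drop;       # deny by default
    ct state established,related accept                     # allow return traffic
    iifname "vlan30" oifname { "vlan10", "vlan20" } accept  # Trusted -> Mgmt/Servers
    iifname { "vlan40", "vlan50" } oifname "wan" accept     # IoT/Guest -> Internet only
  }
}
```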

Hardware Requirements

  • A managed switch that supports 802.1Q VLANs (TP-Link, Ubiquiti, Mikrotik)
  • A router/firewall that can route between VLANs (pfSense, OPNsense, Mikrotik)
  • WiFi access points that support multiple SSIDs mapped to VLANs
Start simple: Even just separating IoT from everything else is a massive security improvement. You can add more VLANs later as your network grows.
9. Reverse Proxy

Nginx Proxy Manager, Caddy, SSL certificates, and wildcard subdomains. Access all your services via clean URLs with automatic HTTPS instead of remembering IP:port combos.

What a Reverse Proxy Does

Instead of accessing services by IP and port (192.168.1.50:8096), a reverse proxy lets you use clean URLs (jellyfin.home.lab) with automatic HTTPS. One entry point (ports 80/443) routes to the correct backend based on the hostname.

Option 1: Nginx Proxy Manager (NPM)

The easiest option for homelabs. Web-based GUI for configuring Nginx reverse proxy:

  • Point-and-click SSL certificate management (Let’s Encrypt)
  • Wildcard certificate support via DNS challenge
  • Access lists for restricting services to specific IPs
  • Custom Nginx config snippets for advanced use cases

Option 2: Caddy

Caddy automatically provisions HTTPS for every site. Zero configuration for SSL:

  • Automatic Let’s Encrypt certificates (no manual setup)
  • Simple Caddyfile syntax (5 lines vs 50 for Nginx)
  • Automatic HTTP → HTTPS redirect
  • Built-in reverse proxy, file server, and load balancer
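A complete Caddyfile for two services fits in a few lines (hostnames and backend addresses are examples; Caddy obtains and renews the certificates itself):

```
# Caddyfile
jellyfin.home.yourdomain.com {
    reverse_proxy 192.168.1.50:8096
}

grafana.home.yourdomain.com {
    reverse_proxy 192.168.1.60:3000
}
```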

Wildcard Certificates

Instead of getting a separate certificate for each subdomain, get one wildcard cert (*.home.yourdomain.com) that covers everything:

  1. Use the DNS challenge method (not HTTP challenge)
  2. Configure your DNS provider’s API credentials in NPM or Caddy
  3. One cert covers grafana.home.yourdomain.com, jellyfin.home.yourdomain.com, etc.

Internal vs External Access

  • Internal only: Split-brain DNS (your local DNS resolves *.home.lab to your proxy’s LAN IP). Services never touch the internet
  • External access: Use Cloudflare Tunnels (Guide 4) or port forward 443 to your proxy. Always use authentication (Authelia, Authentik) for public-facing services
Never expose a service to the internet without authentication. Even “harmless” services get probed by bots within minutes of going public. Use SSO (Authelia, Authentik) or at minimum HTTP basic auth.
10. GPU Passthrough

IOMMU groups, PCIe passthrough for AI inference and gaming VMs, VFIO configuration, and troubleshooting. Give a virtual machine direct access to your GPU for near-native performance.

What Is GPU Passthrough?

GPU passthrough (PCI passthrough) gives a virtual machine direct, exclusive access to a physical GPU. The VM sees the GPU as if it were running on bare metal. This enables:

  • Gaming VMs — run Windows games at near-native performance from a Linux host
  • AI inference — run Ollama, llama.cpp, or Stable Diffusion with full GPU acceleration in a VM or container
  • Transcoding — hardware-accelerated video transcoding for Jellyfin/Plex

Requirements

  • CPU with IOMMU support: Intel VT-d or AMD-Vi. Most modern CPUs have this
  • Motherboard with good IOMMU groups: each device should be in its own group. Consumer boards are hit-or-miss
  • Two GPUs: one for the host (can be integrated/iGPU), one for passthrough. The passed-through GPU is exclusively owned by the VM

Setup on Proxmox

  1. Enable IOMMU in BIOS (Intel VT-d / AMD-Vi / IOMMU)
  2. Add kernel parameters: intel_iommu=on iommu=pt (or amd_iommu=on) to /etc/default/grub
  3. Load VFIO modules: add vfio vfio_iommu_type1 vfio_pci vfio_virqfd to /etc/modules
  4. Blacklist the GPU driver: prevent the host from claiming the GPU (blacklist nouveau or blacklist nvidia)
  5. Bind the GPU to VFIO: add the GPU’s PCI IDs to /etc/modprobe.d/vfio.conf
  6. Add the PCI device to your VM in Proxmox (Hardware → Add → PCI Device)
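Steps 2 through 5 boil down to a handful of file edits. A sketch for an NVIDIA card (the PCI IDs shown are examples; find your own with lspci):

```shell
# 1. Find the GPU and its audio function
lspci -nn | grep -i nvidia          # e.g. ... [10de:2484] and [10de:228b]

# 2. /etc/default/grub (use amd_iommu=on on AMD):
#    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
update-grub

# 3. /etc/modprobe.d/vfio.conf — bind both functions to vfio-pci:
#    options vfio-pci ids=10de:2484,10de:228b
#    softdep nouveau pre: vfio-pci
update-initramfs -u -k all          # then reboot
```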

Common Issues

  • IOMMU group too large: the GPU shares a group with other devices. Use the ACS override patch (risky) or try a different PCIe slot
  • Code 43 (NVIDIA): NVIDIA drivers detect they’re in a VM and refuse to work. Add args: -cpu 'host,hv_vendor_id=proxmox' to the VM config
  • Reset bug: some GPUs don’t reset properly when the VM shuts down, requiring a host reboot. AMD GPUs are generally better at this than NVIDIA
For AI workloads: You don’t always need full passthrough. NVIDIA GPUs support vGPU (with a license) or you can use Docker with --gpus all to share the GPU between containers without passthrough. Much simpler for inference workloads.
11. Home Automation

Home Assistant, Zigbee/Z-Wave, MQTT, automations, and dashboards. Turn your homelab into a smart home that actually works — locally, privately, without cloud dependencies.

Why Home Assistant?

Home Assistant (HA) is an open-source home automation platform that runs locally. It integrates with 2,000+ devices and services. Unlike Google Home or Alexa, your data stays in your house, and automations work even when the internet is down.

Installation Options

  • Home Assistant OS (HAOS) — dedicated VM or Raspberry Pi. Includes Supervisor for easy add-on management. The recommended option for most people
  • Docker container — runs alongside other containers. No Supervisor (install add-ons manually). For experienced Docker users who want more control
  • Proxmox VM — run HAOS as a VM on your Proxmox host. Best of both worlds: dedicated environment with Supervisor, plus the flexibility of virtualization

Protocols: Zigbee, Z-Wave, WiFi

  • Zigbee — low-power mesh network. Devices relay signals through each other. Needs a coordinator (SONOFF Zigbee 3.0 dongle, $15). Best protocol for most smart home devices. Use with Zigbee2MQTT for maximum compatibility
  • Z-Wave — similar to Zigbee but with a different frequency band (fewer interference issues). More expensive devices but very reliable. Good for locks and critical devices
  • WiFi — easy setup but each device adds load to your WiFi network. Devices using ESPHome (custom firmware) are excellent because they operate locally without cloud
  • Matter/Thread — the new standard backed by Apple, Google, Amazon. Still maturing but promises universal compatibility

MQTT

MQTT is a lightweight messaging protocol that many IoT devices use to communicate. Run a broker (Mosquitto) and devices publish/subscribe to topics:

  • zigbee2mqtt/living_room/temperature → sensor publishes temperature readings
  • homeassistant/light/bedroom/set → HA publishes to turn on a light
  • Decouples devices from each other — any device can listen to any topic
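You can watch and inject messages with the Mosquitto command-line tools (broker hostname and topics are examples):

```shell
# Subscribe to everything Zigbee2MQTT publishes
mosquitto_sub -h mqtt-broker -t 'zigbee2mqtt/#' -v

# Publish a command: turn on a light through Zigbee2MQTT
mosquitto_pub -h mqtt-broker -t 'zigbee2mqtt/bedroom_light/set' -m '{"state": "ON"}'
```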

Automations That Actually Help

  • Turn on lights at sunset, off at bedtime (based on your phone’s presence)
  • Send a notification if a door is left open for more than 5 minutes
  • Adjust thermostat based on occupancy (motion sensors + phone presence)
  • Turn off everything when the last person leaves (presence detection)
  • Morning routine: lights at 30% → coffee machine on → weather briefing on a dashboard
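For reference, the door-left-open notification from the list above looks like this in Home Assistant YAML (entity and notify service IDs are examples):

```yaml
# automations.yaml
- alias: "Door left open"
  trigger:
    - platform: state
      entity_id: binary_sensor.front_door
      to: "on"
      for: "00:05:00"        # fires only after 5 continuous minutes
  action:
    - service: notify.mobile_app_my_phone
      data:
        message: "The front door has been open for 5 minutes."
```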
Put all smart home devices on the IoT VLAN (Guide 8). Many cheap IoT devices phone home to servers in China with zero security. Segment them from your main network and only allow internet access, not LAN access.