Intermediate

Set Up an Always-On AI Server

Configure a mini PC as a dedicated AI machine. Auto-start Ollama on boot, expose the API to your LAN, and access it from any device.

⏱ ~20 minutes 💻 Mini PC / old laptop / NUC 🧠 16GB+ RAM

What You'll Need

  • A mini PC, NUC, or old laptop you can leave running 24/7
  • 16GB RAM minimum (32GB to run multiple or larger models)
  • An Ethernet connection to your router (Wi-Fi works but adds latency)
  • A USB drive or SD card for installing the OS
💡 Don't overthink the hardware. An old laptop with 16GB RAM works fine for running 7B models. A $200 mini PC with 32GB handles most things. You don't need a GPU — CPU inference is good enough for a personal server.
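If you already have a candidate machine, a quick check tells you what it can handle. This is a rough sketch for Linux (it reads /proc/meminfo, so the numbers are for the machine you run it on); the 16GB threshold follows the rule of thumb above:

```shell
# Rough capacity check before committing a machine to server duty
total_gb=$(awk '/MemTotal/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
cores=$(nproc)
echo "RAM: ${total_gb}GB, CPU cores: ${cores}"
if [ "${total_gb}" -ge 16 ]; then
  echo "enough for 7B-8B models"
else
  echo "tight: stick to smaller (3B-4B) models"
fi
```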

1 Choose Your Hardware

| Option | Price | RAM | Best For |
| --- | --- | --- | --- |
| Old laptop | $0 (reuse) | 8-16GB | Getting started, testing the concept |
| Beelink Mini S12 Pro | ~$160 | 16GB | Budget pick, quiet, low power (~15W) |
| Intel NUC / ASUS NUC | ~$300 | 32GB | Reliable, compact, good for 8B-13B models |
| Mac Mini (M-series) | ~$500+ | 16-24GB unified | Best performance per watt, runs large models well |
| Mini PC + NVIDIA GPU | ~$600+ | 32GB + VRAM | Fastest inference, handles 30B+ models |
💡 Power matters. This runs 24/7. A mini PC at 15W costs ~$1.30/month in electricity. An old gaming PC at 200W costs ~$17/month. The Mac Mini at 10-20W is incredibly efficient for its performance.
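The tip's numbers are just watts × hours × electricity rate. A one-liner to redo the math for your own hardware (the $0.12/kWh rate is an assumption; substitute your local rate):

```shell
# Monthly cost = watts / 1000 * 24 hours * 30 days * $/kWh
watts=15
rate_per_kwh=0.12
awk -v w="$watts" -v r="$rate_per_kwh" \
  'BEGIN { printf "~$%.2f/month\n", w / 1000 * 24 * 30 * r }'
# 15W → ~$1.30/month; 200W → ~$17.28/month
```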

2 Install the OS

Linux (Ubuntu Server):

  1. Download Ubuntu Server 24.04 LTS
  2. Flash to USB with balenaEtcher or dd
  3. Boot from USB, follow the installer
  4. Enable SSH during install (or sudo apt install openssh-server after)
  5. Set a static IP (we'll need this later)
# Set a static IP (edit the netplan config)
sudo nano /etc/netplan/01-netcfg.yaml

# Example netplan config:
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      addresses: [192.168.1.50/24]
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses: [1.1.1.1, 8.8.8.8]

# Apply the config
sudo netplan apply

Mac Mini: Just use macOS — Ollama runs great natively. Set a static IP in System Settings → Network → Ethernet → Details → TCP/IP → Manually.

💡 Tip: Use Ubuntu Server, not Desktop. No GUI means more RAM for your models. You'll SSH in for everything anyway.
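Before moving on, it's worth confirming the static IP actually took effect after `netplan apply`. A quick check, using the example address from above (substitute yours):

```shell
# Is the expected address assigned to an interface?
ip -4 addr show | grep -q '192.168.1.50' \
  && echo "static IP is set" \
  || echo "static IP not found"

# Is there a default route via the router?
ip route show default
```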

3 Install Ollama & Configure for LAN

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull your models
ollama pull llama3.2
ollama pull mistral

By default, Ollama only listens on 127.0.0.1 (localhost). To access it from other devices on your network, you need to tell it to listen on all interfaces.
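You can see the current bind address with `ss` (Linux; ships with iproute2). Before the change it should show 127.0.0.1:11434; after, 0.0.0.0:11434 or *:11434:

```shell
# Show what is listening on Ollama's port, if anything
ss -tln | grep 11434 || echo "nothing listening on 11434"
```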

Linux (systemd):

# Edit the Ollama service
sudo systemctl edit ollama

# Add these lines in the editor:
[Service]
Environment="OLLAMA_HOST=0.0.0.0"

# Save, then restart
sudo systemctl restart ollama

Mac (launchd):

# Set the environment variable globally
launchctl setenv OLLAMA_HOST "0.0.0.0"

# Restart Ollama (quit from menu bar, it auto-restarts)
# Or restart it manually:
pkill ollama && ollama serve &

Test from another device on your network:

# From your laptop (replace with your server's IP)
curl http://192.168.1.50:11434/api/tags
# Should return a JSON list of your models

# Chat with a model remotely
OLLAMA_HOST=192.168.1.50 ollama run llama3.2 "Hello from across the network"
⚠️ LAN only. Setting OLLAMA_HOST=0.0.0.0 exposes Ollama to your local network, but your router's firewall keeps it off the internet. Do not port-forward 11434 to the internet — there's no authentication on the Ollama API by default.
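If you want defense in depth even on the LAN, a host firewall can limit which machines may reach the port. A sketch using ufw on Ubuntu, assuming your LAN subnet is 192.168.1.0/24 (adjust to match yours):

```shell
# Don't lock yourself out: allow SSH first
sudo ufw allow OpenSSH

# Allow the Ollama port only from the local subnet
sudo ufw allow from 192.168.1.0/24 to any port 11434 proto tcp

# Enable the firewall; the default policy denies other incoming traffic
sudo ufw enable
```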

4 Auto-Start on Boot

The server needs to come back up after power outages and reboots without you touching it.

Linux: Ollama's installer already creates a systemd service. Verify it's enabled:

# Make sure it starts on boot
sudo systemctl enable ollama
sudo systemctl status ollama

# Also enable SSH so you can reach the machine after a reboot
sudo systemctl enable ssh

Mac: Ollama auto-starts by default (it's a menu bar app). For a headless Mac Mini, enable auto-login:

# System Settings → General → Login Items → Open at Login
#   Make sure Ollama is in the list

# Enable SSH for remote access:
# System Settings → General → Sharing → Remote Login → ON

BIOS settings (important for unattended operation):

  • Restore on AC Power Loss: Power On — so it reboots after power cuts
  • Wake on LAN: Enabled — wake it remotely if it sleeps
  • Disable sleep/hibernate — the whole point is always-on
💡 Tip: Plug the server into a cheap UPS ($40-60) to survive brief power flickers. It doesn't need to last hours — just long enough to avoid file system corruption from hard shutdowns.
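After a reboot test (pull the plug, wait, plug back in), it helps to have a one-shot health check you can run from any machine or a cron job. A minimal sketch, assuming the server's example address from this guide:

```shell
# Print "ollama up" if the API answers within 5 seconds, "ollama down" otherwise
HOST=192.168.1.50
if curl -sf --max-time 5 "http://${HOST}:11434/api/tags" >/dev/null; then
  echo "ollama up"
else
  echo "ollama down"
fi
```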

5 Use It From Anywhere in Your Home

Now any device on your network can talk to your AI server. Here's how to point different tools at it.

Ollama CLI from another machine:

# Set the remote host
export OLLAMA_HOST=192.168.1.50

# Now use ollama as if it were local
ollama run llama3.2 "Summarize the Roman Empire in 3 sentences"
ollama list
ollama pull mistral

Open WebUI (browser-based chat):

# Run Open WebUI as a Docker container on the server
docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://localhost:11434 \
  -v open-webui:/app/backend/data \
  --restart=unless-stopped \
  ghcr.io/open-webui/open-webui:main

# Access from any browser: http://192.168.1.50:3000

From Python scripts:

import requests

resp = requests.post(
    "http://192.168.1.50:11434/api/chat",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Hello from my laptop!"}],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])

From AI OS:

# In your AI OS .env or config, point at the server
OLLAMA_HOST=http://192.168.1.50:11434
💡 Tip: Add a DNS alias on your Pi-hole (if you followed our Pi-hole guide) so you can use ai.local instead of the IP. In Pi-hole admin → Local DNS → DNS Records, add ai.local → 192.168.1.50.
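To confirm the alias works, assuming you added the ai.local record in Pi-hole and the machine you're testing from uses the Pi-hole as its DNS server:

```shell
# Resolve the alias through whatever DNS this machine is using
getent hosts ai.local || echo "ai.local does not resolve yet"

# Then try the API by name
curl -s --max-time 5 http://ai.local:11434/api/tags
```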

✅ What You've Set Up

  • A dedicated AI server that starts automatically on boot — no monitor or keyboard needed
  • Ollama listening on your LAN so any device can use it
  • Remote access via SSH for maintenance
  • Optional web UI for ChatGPT-like browser access from any device

Next Steps

  • Add your model library — follow our Model Library guide to organize models on NVMe storage attached to the server.
  • Set up Tailscale — access your AI server from outside your home. Tailscale creates a secure mesh VPN with zero port forwarding.
  • Monitor with Grafana — track CPU, RAM, and GPU usage over time. Know when you're pushing the limits.
  • Run AI OS on the server — put the full AI OS stack on your always-on machine so background loops, memory, and feeds run 24/7.
⚠️ Keep it updated. Run sudo apt update && sudo apt upgrade -y weekly (Linux) or enable automatic security updates. An always-on machine on your network should always be patched.
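On Ubuntu, the security half of this can be automated with the unattended-upgrades package; one common way to switch it on:

```shell
# Install and enable automatic security updates
sudo apt install -y unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades

# This writes /etc/apt/apt.conf.d/20auto-upgrades containing:
#   APT::Periodic::Update-Package-Lists "1";
#   APT::Periodic::Unattended-Upgrade "1";
```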
