Intermediate

NAS for AI: Training Data & Model Storage

Set up a NAS as central storage for your AI lab. Share model files, training data, and backups across machines over your LAN.

⏱ ~30 minutes 💻 Mac / Linux / Windows 🗄️ 2-bay NAS + drives

What You'll Need

  • A 2-bay (or more) NAS — Synology, TerraMaster, or UGREEN
  • 2x hard drives — 4TB+ each recommended (HDD for bulk, SSD for speed)
  • An Ethernet cable — NAS over WiFi is painfully slow; wire it to your router
  • A computer on the same local network
💡 Why a NAS for AI? Models are big — a quantized Llama 3 70B is ~40GB, and training datasets can run 100GB+. A NAS gives you one place to store everything, accessible from any machine on your network — your AI server, your laptop, your Pi. No more shuttling files between USB drives.

1 Choose Your NAS

| NAS | Bays | RAM | Network | Price | Best For |
|---|---|---|---|---|---|
| UGREEN NASync DXP2800 | 2 | 4GB | 2.5GbE | ~$200 | Budget-friendly starter NAS |
| TerraMaster F2-223 | 2 | 4GB | 2.5GbE | ~$230 | Good value, Docker support |
| Synology DS224+ | 2 | 2GB | 1GbE | ~$300 | Best software ecosystem (DSM) |
| Synology DS423+ | 4 | 2GB | 2x 1GbE | ~$500 | Room to grow, expandable |
| QNAP TS-264 | 2 | 8GB | 2.5GbE | ~$400 | Power user, HDMI out, more RAM |
💡 For most people: A 2-bay NAS with 2x 4TB drives in RAID 1 (mirror) gives you 4TB of usable, redundant storage for ~$350 total. That's enough for dozens of large models plus training data.
⚠️ RAID is not backup. RAID 1 protects against a single drive failure, but it won't save you from accidental deletion, ransomware, or the NAS itself dying. Always keep important data in at least two places.

2 Initial Setup & Storage Pool

After inserting your drives and connecting Ethernet:

  1. Find your NAS on the network — most brands have a discovery tool:
     ```bash
     # Synology:    find.synology.com
     # TerraMaster: tnas.online
     # UGREEN:      ugreenlink.com
     # Or just scan your network:
     arp -a | grep -i "unknown"
     ```
  2. Run the setup wizard — create your admin account and let it initialize the drives
  3. Create a storage pool — choose RAID 1 (mirror) for redundancy, or SHR (Synology Hybrid RAID) if available
  4. Create shared folders — set up folders for different purposes:
     ```
     # Recommended folder structure:
     /ai-models/      # Ollama models, GGUF files
     /training-data/  # JSONL, datasets, corpora
     /backups/        # Machine backups, configs
     /projects/       # Shared project files
     ```
💡 Enable SMB and NFS. SMB (Samba) works best for Mac and Windows. NFS is faster for Linux. Most NAS devices support both — turn them both on in the settings.
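For Linux clients, the NFS route can look like the sketch below. The export path is an example — brands differ (Synology typically exports under /volume1/, others vary), so check your NAS's NFS settings for the real path:

```shell
# Install the NFS client tools (Debian/Ubuntu)
sudo apt install nfs-common

# Create a mount point and mount the export (example IP and path)
sudo mkdir -p /mnt/nas-ai
sudo mount -t nfs 192.168.1.50:/volume1/ai-models /mnt/nas-ai

# Or make it permanent — add this line to /etc/fstab:
# 192.168.1.50:/volume1/ai-models /mnt/nas-ai nfs defaults,_netdev 0 0
```

The `_netdev` option tells the system to wait for the network before mounting, which avoids boot hangs when the NAS is unreachable.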

3 Mount the NAS from Your AI Machines

Once your NAS has shared folders, mount them on every machine that needs access.

macOS — mount via Finder or terminal:

```bash
# Create a mount point
sudo mkdir -p /Volumes/nas-ai

# Mount the SMB share (replace NAS_IP and folder name)
mount_smbfs //your-user@192.168.1.50/ai-models /Volumes/nas-ai

# Verify
ls /Volumes/nas-ai
```

Linux — mount via fstab for auto-mount on boot:

```bash
# Install CIFS utilities
sudo apt install cifs-utils

# Create a mount point
sudo mkdir -p /mnt/nas-ai

# Create a credentials file (keeps the password out of fstab)
sudo nano /etc/nas-credentials
# Contents:
# username=your-user
# password=your-password
sudo chmod 600 /etc/nas-credentials

# Add this line to /etc/fstab for auto-mount:
//192.168.1.50/ai-models /mnt/nas-ai cifs credentials=/etc/nas-credentials,uid=1000,gid=1000 0 0

# Mount now without rebooting
sudo mount -a
```

Windows — map as a network drive:

```powershell
# In PowerShell (or File Explorer → Map Network Drive):
net use Z: \\192.168.1.50\ai-models /persistent:yes
```
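Once a share is mounted, it's worth checking what throughput you actually get before hosting models on it. A minimal sketch for the Mac/Linux mounts — point it at any large file on the NAS (the path below is an example):

```shell
# Sequential read test: stream a file to /dev/null and report the
# summary line dd prints on stderr (bytes copied, seconds, MB/s).
read_speed() {
  dd if="$1" of=/dev/null bs=1M 2>&1 | tail -n 1
}

# Example: read_speed /mnt/nas-ai/llama3-8b.gguf
```

If the number is far below your link speed (~110MB/s for 1GbE, ~280MB/s for 2.5GbE), suspect WiFi, a bad cable, or a slow switch port.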

4 Point Ollama & Training Data at the NAS

Now redirect your AI tools to use the NAS for storage. This means any machine on your network pulls models from the same place.

Ollama — change the model storage path:

```bash
# macOS / Linux — set the environment variable
export OLLAMA_MODELS=/Volumes/nas-ai/ollama

# Make it permanent (add to your shell profile)
echo 'export OLLAMA_MODELS=/Volumes/nas-ai/ollama' >> ~/.zshrc

# Restart Ollama, then pull a model — it goes to the NAS
ollama pull llama3.2:3b
ls /Volumes/nas-ai/ollama/manifests/
```
⚠️ Network speed matters. A ~5GB 7B model loads from a 1GbE NAS (~110MB/s real-world) in roughly 45 seconds; over 2.5GbE, closer to 20 seconds. Once the model is in RAM, inference speed is identical — the NAS only affects initial load time. For very large models (70B+), consider caching them locally and treating the NAS as the source of truth.

Symlink approach — keep tools unchanged, redirect the folder:

```bash
# Instead of changing env vars, symlink the default location
# macOS default:  ~/.ollama/models
# Linux default:  /usr/share/ollama/.ollama/models

# Move existing models to the NAS first
mv ~/.ollama/models/* /Volumes/nas-ai/ollama/

# Remove the now-empty directory, then symlink in its place
# (if the directory still exists, ln would create the link inside it)
rmdir ~/.ollama/models
ln -s /Volumes/nas-ai/ollama ~/.ollama/models
```

Training data — organize on the NAS:

```
# Recommended training data structure
/training-data/
├── finetune/
│   ├── identity_train.jsonl
│   ├── reasoning_train.jsonl
│   └── combined.jsonl
├── datasets/
│   ├── wikipedia-en/
│   └── code-alpaca/
└── exports/
    ├── chat_history.jsonl
    └── eval_results.json
```
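With training data living on a shared mount, a quick sanity check before each run saves a failed fine-tune halfway through. A hypothetical helper, assuming python3 is available on the machine doing the training:

```shell
# Verify every line of a JSONL file parses as JSON before training reads it.
# Reports the first bad line and exits non-zero; prints OK otherwise.
validate_jsonl() {
  python3 - "$1" <<'EOF'
import json, sys

path = sys.argv[1]
with open(path) as f:
    for n, line in enumerate(f, 1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as e:
            print(f"{path}:{n}: invalid JSON ({e})")
            sys.exit(1)
print(f"{path}: OK")
EOF
}

# Example: validate_jsonl /mnt/nas-ai/training-data/finetune/combined.jsonl
```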
💡 This pairs with the Model Library guide. That guide covers organizing models on a local NVMe. This guide extends that to network storage — useful when you have multiple machines that all need the same models.

5 Backups & Maintenance

A NAS isn't just storage — it's your safety net. Set up automated backups and health monitoring.

Automated snapshots (Synology):

```
# Synology → Snapshot Replication → Scheduled Task
# Set daily snapshots on /ai-models/ and /training-data/
# Retention: keep the last 7 daily and the last 4 weekly
# This lets you roll back if a model gets corrupted or deleted
```

Scrub schedule — check drive health:

```
# Most NAS UIs have a "Data Scrubbing" option under Storage
# Schedule it monthly — it reads every block and verifies checksums
# Catches silent data corruption (bit rot) before it becomes a problem
```

Backup the NAS itself:

  • External USB drive — plug into the NAS USB port, schedule nightly copy of critical folders
  • Second NAS — rsync between two NAS devices for full redundancy
  • Cloud (optional) — Synology Hyper Backup supports Backblaze B2 (~$5/TB/month) for offsite

Monitor drive health from the command line:

```bash
# SSH into your NAS (enable SSH in settings first)
ssh admin@192.168.1.50

# Check SMART status on Synology
smartctl -a /dev/sda

# Key fields to watch:
#   Reallocated_Sector_Ct   — should be 0
#   Current_Pending_Sector  — should be 0
#   Temperature_Celsius     — under 45°C
```
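For unattended monitoring, that smartctl output can be checked by a small script and wired into cron. A sketch — the attribute names match smartmontools' standard table, but verify them against your NAS's actual output before relying on it:

```shell
# Read `smartctl -a` output on stdin; warn and exit non-zero if the
# raw value (last field) of either critical attribute is above zero.
check_smart() {
  awk '
    /Reallocated_Sector_Ct|Current_Pending_Sector/ {
      if ($NF + 0 > 0) { print "WARNING: " $2 " = " $NF; bad = 1 }
    }
    END { exit bad }
  '
}

# Example: ssh admin@192.168.1.50 smartctl -a /dev/sda | check_smart
```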
| Storage Task | Size Estimate | 2TB NAS | 4TB NAS | 8TB NAS |
|---|---|---|---|---|
| Small models (1B–3B) | ~2GB each | ~1000 | ~2000 | ~4000 |
| Medium models (7B–8B) | ~5GB each | ~400 | ~800 | ~1600 |
| Large models (70B) | ~40GB each | ~50 | ~100 | ~200 |
| Fine-tune JSONL datasets | ~10MB–1GB | Thousands | Thousands | Thousands |
| Chat history exports | ~1MB/1000 msgs | Millions | Millions | Millions |

✅ What You've Set Up

  • A NAS with a RAID-protected storage pool and organized shared folders
  • Network mounts on Mac, Linux, or Windows for seamless access
  • Ollama models stored centrally — pull once, use from any machine
  • Training data and exports organized and accessible over LAN
  • Automated snapshots, scrub schedules, and drive health monitoring

Next Steps

  • Run Docker on the NAS — some NAS devices (Synology, QNAP) can run containers directly. Host Open WebUI right on the NAS.
  • Set up a Time Machine target — most NAS devices can act as a Mac backup destination. Protect your whole machine, not just AI files.
  • Add an SSD cache — if your NAS has M.2 slots, adding an SSD cache dramatically speeds up small file reads (metadata, JSONL files).
  • Pair with your AI server — the always-on AI server loads models from the NAS, serves them to everything else on your network.
  • Centralize training data for fine-tuning — keep all your JSONL files on the NAS and point your model library at it.
💡 The full stack: NAS (store) → AI Server (run) → Ollama (think) → Vision + Whisper (perceive). Your NAS is the foundation — the shared brain that every machine in your lab draws from.
