Intermediate

NAS for AI: Training Data & Model Storage

Set up a NAS as central storage for your AI lab. Share model files, training data, and backups across machines over your LAN.

⏱ ~30 minutes 💻 Mac / Linux / Windows 🗄️ 2-bay NAS + drives

What You'll Need

  • A 2-bay (or more) NAS — Synology, TerraMaster, or UGREEN
  • 2x hard drives — 4TB+ each recommended (HDD for bulk, SSD for speed)
  • An Ethernet cable — NAS over WiFi is painfully slow; wire it to your router
  • A computer on the same local network
💡 Why a NAS for AI? Models are big — a quantized Llama 3 70B is ~40GB, and training datasets can run 100GB+. A NAS gives you one place to store everything, accessible from any machine on your network — your AI server, your laptop, your Pi. No more shuttling files between USB drives.

1 Choose Your NAS

| NAS | Bays | RAM | Network | Price | Best For |
|---|---|---|---|---|---|
| UGREEN NASync DXP2800 | 2 | 4GB | 2.5GbE | ~$200 | Budget-friendly starter NAS |
| TerraMaster F2-223 | 2 | 4GB | 2.5GbE | ~$230 | Good value, Docker support |
| Synology DS224+ | 2 | 2GB | 1GbE | ~$300 | Best software ecosystem (DSM) |
| Synology DS423+ | 4 | 2GB | 2x 1GbE | ~$500 | Room to grow, expandable |
| QNAP TS-264 | 2 | 8GB | 2.5GbE | ~$400 | Power user, HDMI out, more RAM |
💡 For most people: A 2-bay NAS with 2x 4TB drives in RAID 1 (mirror) gives you 4TB of usable, redundant storage for ~$350 total. That's enough for dozens of large models plus training data.
⚠️ RAID is not backup. RAID 1 protects against a single drive failure, but it won't save you from accidental deletion, ransomware, or the NAS itself dying. Always keep important data in at least two places.

2 Initial Setup & Storage Pool

After inserting your drives and connecting Ethernet:

  1. Find your NAS on the network — most brands have a discovery tool:
     ```bash
     # Synology:    find.synology.com
     # TerraMaster: tnas.online
     # UGREEN:      ugreenlink.com
     # Or just scan your network:
     arp -a | grep -i "unknown"
     ```
  2. Run the setup wizard — create your admin account and let it initialize the drives
  3. Create a storage pool — choose RAID 1 (mirror) for redundancy, or SHR (Synology Hybrid RAID) if available
  4. Create shared folders — set up folders for different purposes:
     ```
     # Recommended folder structure:
     /ai-models/      # Ollama models, GGUF files
     /training-data/  # JSONL, datasets, corpora
     /backups/        # Machine backups, configs
     /projects/       # Shared project files
     ```
💡 Enable SMB and NFS. SMB (Samba) works best for Mac and Windows. NFS is faster for Linux. Most NAS devices support both — turn them both on in the settings.
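For Linux clients, the NFS route can look like the sketch below. The export path is an example — brands differ (Synology typically exports under /volume1/, others vary), so check your NAS's NFS settings for the real path:

```shell
# Install the NFS client tools (Debian/Ubuntu)
sudo apt install nfs-common

# Create a mount point and mount the export (example IP and path)
sudo mkdir -p /mnt/nas-ai
sudo mount -t nfs 192.168.1.50:/volume1/ai-models /mnt/nas-ai

# Or make it permanent — add this line to /etc/fstab:
# 192.168.1.50:/volume1/ai-models /mnt/nas-ai nfs defaults,_netdev 0 0
```

The `_netdev` option tells the system to wait for the network before mounting, which avoids boot hangs when the NAS is unreachable.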

3 Mount the NAS from Your AI Machines

Once your NAS has shared folders, mount them on every machine that needs access.

macOS — mount via Finder or terminal:

```bash
# Create a mount point
sudo mkdir -p /Volumes/nas-ai

# Mount the SMB share (replace NAS_IP and folder name)
mount_smbfs //your-user@192.168.1.50/ai-models /Volumes/nas-ai

# Verify
ls /Volumes/nas-ai
```

Linux — mount via fstab for auto-mount on boot:

```bash
# Install CIFS utilities
sudo apt install cifs-utils

# Create a mount point
sudo mkdir -p /mnt/nas-ai

# Create a credentials file (keeps the password out of fstab)
sudo nano /etc/nas-credentials
# Contents:
# username=your-user
# password=your-password
sudo chmod 600 /etc/nas-credentials

# Add this line to /etc/fstab for auto-mount:
//192.168.1.50/ai-models /mnt/nas-ai cifs credentials=/etc/nas-credentials,uid=1000,gid=1000 0 0

# Mount now without rebooting
sudo mount -a
```

Windows — map as a network drive:

```powershell
# In PowerShell (or File Explorer → Map Network Drive):
net use Z: \\192.168.1.50\ai-models /persistent:yes
```
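Once a share is mounted, it's worth checking what throughput you actually get before hosting models on it. A minimal sketch for the Mac/Linux mounts — point it at any large file on the NAS (the path below is an example):

```shell
# Sequential read test: stream a file to /dev/null and report the
# summary line dd prints on stderr (bytes copied, seconds, MB/s).
read_speed() {
  dd if="$1" of=/dev/null bs=1M 2>&1 | tail -n 1
}

# Example: read_speed /mnt/nas-ai/llama3-8b.gguf
```

If the number is far below your link speed (~110MB/s for 1GbE, ~280MB/s for 2.5GbE), suspect WiFi, a bad cable, or a slow switch port.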

4 Point Ollama & Training Data at the NAS

Now redirect your AI tools to use the NAS for storage. This means any machine on your network pulls models from the same place.

Ollama — change the model storage path:

```bash
# macOS / Linux — set the environment variable
export OLLAMA_MODELS=/Volumes/nas-ai/ollama

# Make it permanent (add to your shell profile)
echo 'export OLLAMA_MODELS=/Volumes/nas-ai/ollama' >> ~/.zshrc

# Restart Ollama, then pull a model — it goes to the NAS
ollama pull llama3.2:3b
ls /Volumes/nas-ai/ollama/manifests/
```
⚠️ Network speed matters. A ~5GB 7B model loads from a 1GbE NAS (~110MB/s real-world) in roughly 45 seconds; over 2.5GbE, closer to 20 seconds. Once the model is in RAM, inference speed is identical — the NAS only affects initial load time. For very large models (70B+), consider caching them locally and treating the NAS as the source of truth.

Symlink approach — keep tools unchanged, redirect the folder:

```bash
# Instead of changing env vars, symlink the default location
# macOS default:  ~/.ollama/models
# Linux default:  /usr/share/ollama/.ollama/models

# Move existing models to the NAS first
mv ~/.ollama/models/* /Volumes/nas-ai/ollama/

# Remove the now-empty directory, then symlink in its place
# (if the directory still exists, ln would create the link inside it)
rmdir ~/.ollama/models
ln -s /Volumes/nas-ai/ollama ~/.ollama/models
```

Training data — organize on the NAS:

```
# Recommended training data structure
/training-data/
├── finetune/
│   ├── identity_train.jsonl
│   ├── reasoning_train.jsonl
│   └── combined.jsonl
├── datasets/
│   ├── wikipedia-en/
│   └── code-alpaca/
└── exports/
    ├── chat_history.jsonl
    └── eval_results.json
```
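With training data living on a shared mount, a quick sanity check before each run saves a failed fine-tune halfway through. A hypothetical helper, assuming python3 is available on the machine doing the training:

```shell
# Verify every line of a JSONL file parses as JSON before training reads it.
# Reports the first bad line and exits non-zero; prints OK otherwise.
validate_jsonl() {
  python3 - "$1" <<'EOF'
import json, sys

path = sys.argv[1]
with open(path) as f:
    for n, line in enumerate(f, 1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as e:
            print(f"{path}:{n}: invalid JSON ({e})")
            sys.exit(1)
print(f"{path}: OK")
EOF
}

# Example: validate_jsonl /mnt/nas-ai/training-data/finetune/combined.jsonl
```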
💡 This pairs with the Model Library guide. That guide covers organizing models on a local NVMe. This guide extends that to network storage — useful when you have multiple machines that all need the same models.

5 Backups & Maintenance

A NAS isn't just storage — it's your safety net. Set up automated backups and health monitoring.

Automated snapshots (Synology):

```
# Synology → Snapshot Replication → Scheduled Task
# Set daily snapshots on /ai-models/ and /training-data/
# Retention: keep the last 7 daily and the last 4 weekly
# This lets you roll back if a model gets corrupted or deleted
```

Scrub schedule — check drive health:

```
# Most NAS UIs have a "Data Scrubbing" option under Storage
# Schedule it monthly — it reads every block and verifies checksums
# Catches silent data corruption (bit rot) before it becomes a problem
```

Backup the NAS itself:

  • External USB drive — plug into the NAS USB port, schedule nightly copy of critical folders
  • Second NAS — rsync between two NAS devices for full redundancy
  • Cloud (optional) — Synology Hyper Backup supports Backblaze B2 (~$5/TB/month) for offsite

Monitor drive health from the command line:

```bash
# SSH into your NAS (enable SSH in settings first)
ssh admin@192.168.1.50

# Check SMART status on Synology
smartctl -a /dev/sda

# Key fields to watch:
#   Reallocated_Sector_Ct   — should be 0
#   Current_Pending_Sector  — should be 0
#   Temperature_Celsius     — under 45°C
```
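For unattended monitoring, that smartctl output can be checked by a small script and wired into cron. A sketch — the attribute names match smartmontools' standard table, but verify them against your NAS's actual output before relying on it:

```shell
# Read `smartctl -a` output on stdin; warn and exit non-zero if the
# raw value (last field) of either critical attribute is above zero.
check_smart() {
  awk '
    /Reallocated_Sector_Ct|Current_Pending_Sector/ {
      if ($NF + 0 > 0) { print "WARNING: " $2 " = " $NF; bad = 1 }
    }
    END { exit bad }
  '
}

# Example: ssh admin@192.168.1.50 smartctl -a /dev/sda | check_smart
```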
| Storage Task | Size Estimate | 2TB NAS | 4TB NAS | 8TB NAS |
|---|---|---|---|---|
| Small models (1B–3B) | ~2GB each | ~1000 | ~2000 | ~4000 |
| Medium models (7B–8B) | ~5GB each | ~400 | ~800 | ~1600 |
| Large models (70B) | ~40GB each | ~50 | ~100 | ~200 |
| Fine-tune JSONL datasets | ~10MB–1GB | Thousands | Thousands | Thousands |
| Chat history exports | ~1MB/1000 msgs | Millions | Millions | Millions |

✅ What You've Set Up

  • A NAS with a RAID-protected storage pool and organized shared folders
  • Network mounts on Mac, Linux, or Windows for seamless access
  • Ollama models stored centrally — pull once, use from any machine
  • Training data and exports organized and accessible over LAN
  • Automated snapshots, scrub schedules, and drive health monitoring

Next Steps

  • Run Docker on the NAS — some NAS devices (Synology, QNAP) can run containers directly. Host Open WebUI right on the NAS.
  • Set up a Time Machine target — most NAS devices can act as a Mac backup destination. Protect your whole machine, not just AI files.
  • Add an SSD cache — if your NAS has M.2 slots, adding an SSD cache dramatically speeds up small file reads (metadata, JSONL files).
  • Pair with your AI server — the always-on AI server loads models from the NAS, serves them to everything else on your network.
  • Centralize training data for fine-tuning — keep all your JSONL files on the NAS and point your model library at it.
💡 The full stack: NAS (store) → AI Server (run) → Ollama (think) → Vision + Whisper (perceive). Your NAS is the foundation — the shared brain that every machine in your lab draws from.
